Rise of the machines – the state of machine translation (Report preview for Project Opus)

Introduction

Machine translation is perhaps the most rapidly evolving space in the language services industry. It is easy to get lost with new innovations being announced each day. This report will help you stay up to speed without having to research for days. 

In this Insight Report, we look at top machine translation providers and how they stack up. Specifically we analyze language coverage, pricing, integrations and of course output quality. 

Information contained in this report:

  1. TL;DR
  2. Background
  3. How LSPs make money
  4. Market size
  5. Types of MT
  6. Ranking quality output
  7. Language coverage
  8. CAT Tool integration
  9. Pricing
  10. Summary

Note that in this report we look at output quality for the 11 top machine translation providers for 35 language pairs, both from and into English. To get this information Nimdzi has partnered with Inten.to. 

“Intento benchmarks Cognitive Services and provides a single API to use all of them. We help to discover and use the best Artificial Intelligence services for every task without spending human effort on integration and switching providers.”

We know you are busy! That’s why each of our reports is formatted so that information can be quickly and easily digested. For those in a hurry, we provide the TL;DR (Too long; didn’t read!) section at the beginning of each report. At the end of each report we also summarize key points with our own Insights.

For those that think the devil is in the details, we have you covered, too! In the body of the report we go into as much detail as possible about each topic discussed. Still not satisfied? Never hesitate to reach out to us directly to request additional information. We’re here to help!


TL;DR

Market size

There is no consensus

Estimates for the size of the machine translation market vary widely. Estimating market size is difficult because the industry is constantly evolving. Rapid growth and new entrants make for a moving target that is hard to hit.

LSP Opportunities

Translating more

LSPs do not have to see machine translation as a threat. Rather, it can be an opportunity. Machine translation allows LSPs to improve margins internally by increasing translator efficiency, and also allows for translation of high-volume content that previously would have been cost-prohibitive, such as e-commerce content and user-generated content.

Impending boom

Machine Translation about to explode

Existing players are reporting record growth and new big players continue to flock to the market. Amazon has recently announced a solution to rival Google Translate, and Apple is testing their own technology, expected to release soon. 

Quality

Comparing engines by quality

Perhaps the most important aspect of evaluating machine translation engines is evaluating the quality of the output. Output quality varies based on many factors, such as engine, language pair, and source content. This report looks at quality data from 11 different MT engines, covering 35 language pairs. There are some clear winners and losers, though it is important to remember that this evaluation was performed on only one content type and different engines will be more effective with different content types.

Pricing

Cost models for MT providers

Pricing for MT providers can follow different models. The simplest is a markup on top of translation rates, which are already fairly well standardized. Other models include billing for support, on-premise servers, and hosted SaaS. There are also diverse pricing models for post-editing services. There is not yet a consensus on whether it makes sense to charge for PE services per word, per hour, or as a hybrid with existing TM match pricing models.

Integrations

Playing nice with other tools

Machine translation is only useful if you can use it. This means it somehow needs to be integrated into existing translation workflows and tool-sets. We look at the level of integration of over 35 providers with 10 of the top translation management systems to see who plays nice with who. 

Coverage

Which language pairs are covered

Machine translation engines are built for individual languages. Therefore, it is important to pick an engine that will handle all of the language pairs you need. Google, Yandex, and Microsoft offer the most language pairs, both in total and unique. Unique language pairs are defined as those which are only offered by one engine. If one of those unique language pairs is a “must have,” it is important to know which engine offers it.
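The definition of unique language pairs can be made concrete with a minimal sketch. The engine names and coverage data below are hypothetical placeholders for illustration, not real engine capabilities; only the counting logic matters.

```python
from collections import Counter

# Hypothetical coverage data for illustration only; these are NOT real
# engine capabilities. Each pair is (source language, target language).
coverage = {
    "engine_a": {("en", "de"), ("en", "fr"), ("en", "sw")},
    "engine_b": {("en", "de"), ("en", "fr")},
    "engine_c": {("en", "de"), ("en", "kk")},
}

def unique_pairs(coverage):
    """Return, per engine, the pairs that no other engine offers."""
    # Count how many engines offer each pair.
    counts = Counter(pair for pairs in coverage.values() for pair in pairs)
    return {
        engine: {pair for pair in pairs if counts[pair] == 1}
        for engine, pairs in coverage.items()
    }

result = unique_pairs(coverage)
for engine, pairs in sorted(result.items()):
    print(engine, "total:", len(coverage[engine]), "unique:", sorted(pairs))
```

With real coverage lists in place of the toy data, the same count shows which engine is the only option for a “must have” pair.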

Evolution

From RbMT to NMT and beyond

Machine translation has come a long way over the years. Today, we have many different types of machine translation beyond rules-based and statistical. Neural machine translation is a game changer and, when combined into hybrid NMT-SMT engines, can be even more powerful.

Background

Google Translate processed 146 billion words a day in 2016. That is already three times more than all professional translators in the world together can do in a month. Since then, the scale has only been growing.

For those of you that are worried about machine translation taking over, we are sorry to report that the doomsday scenario has already snuck up on us. The machines have already surpassed us.

Contrary to common expectations, the machines have not stolen all of our jobs. Global demand for language services continues to increase every year. The demand for quality linguists is at record highs. Machine translation (at least in its raw form) does not yet match human translation, and human professional translators are still required for projects that require quality translations.

Technology is not replacing translators. Translators using technology are replacing those who don’t.

However, that doesn’t mean professional translation is completely safe from MT or that professional services providers can ignore MT. With post-editing for most types of content, and raw MT for user-generated content, it is becoming urgent for translation companies and global brands to understand and harness the power of machine translation.

How LSPs make money with MT

Based on our interviews, the clear majority of translation companies see machine translation primarily as a threat. They do not yet generate a lot of new revenue by selling machine translation solutions. Instead, the quickly growing volume of post-editing work is cannibalizing income from professional translations. PEMT often warrants lower margins, partly because the service is new and requires adaptation from LSP vendor management, project management and sales. That’s why established LSPs often try to slow down MT adoption into buyer workflows, citing “client miseducation” about MT.

LSP opportunities with MT are connected with using it internally to improve efficiency and margins, and with making available projects that were not feasible before due to schedule and budgetary constraints.

Most common uses for MT:

 

Internal

Supplementing TMs

The low-hanging fruit is to pre-translate numbers, addresses, countries, tags, and similar content with MT. In conjunction with translation memory, MT can produce even better savings. If the LSP introduces MT ahead of their buyer, they can lower translator costs while still charging full price to the customer.

UGC

Translating user generated content

To sell more, marketers in consumer industries increasingly rely on user-generated content such as reviews, social media posts and community guides to their products. An unscripted review from a happy user promotes products better than a shiny official landing page full of marketing praise written by a third-party copywriter. Marketers will look to providers that can offer them an MT solution to expand the reach of these user interactions.

E-commerce

High-volume product descriptions

Online commerce continues to expand faster than brick-and-mortar shops. E-stores have thousands of items in stock, sometimes hundreds of thousands. When they want to enter a new market, it’s impossible to translate the whole stock in a short period of time and on a feasible budget. MT is the answer.

How much money is there in MT?

The MT software market is quite small, estimates ranging from USD 130 million to USD 400 million. This is a grain of sand compared to USD 25,000 – 40,000 million in professional translation services.

Industry outsiders typically give the MT market a higher valuation, while insiders are more modest. For example, Global Market Insights predicted MT to hit USD 1.5 billion by 2024. Industry insider think tank TAUS made their first estimate in 2015 at USD 250 million, and corrected it down to USD 130 million two years later. They foresee a 6 percent growth rate for the coming years.

The reality is that estimates vary widely and every number we have seen so far is an educated guess. Below are some of the projections that have been published over the last three years. 

2015 market sizing estimates

Projected USD 983.3 million by 2022

Grand View Research

Estimating that the industry will grow close to USD 1 billion by 2022.

USD 250 million in 2015

TAUS

Estimated USD 250 million in 2015

2016 market sizing estimates

Global Market Insights

Estimated USD 400 million in 2016 with a 19 percent growth rate leading to USD 1.5 billion by 2024.

2017 market sizing estimates

USD 123 million

P&S Market Research

Estimated USD 123 million for previous year (2016) with a 6.7 percent growth rate

USD 130 million in 2017

TAUS

Estimated USD 130 million in 2017, with a 6 percent growth rate prediction
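To see how far apart these projections end up, the growth rates can be compounded forward. The sketch below assumes a constant annual growth rate applied to the quoted base figures, which is a simplification; the function name is our own.

```python
def project(base_musd, annual_growth, years):
    """Compound a base market size (USD million) forward at a fixed rate."""
    return base_musd * (1 + annual_growth) ** years

# USD 400 million in 2016 growing 19 percent a year until 2024.
high = project(400, 0.19, 2024 - 2016)
# USD 130 million in 2017 growing 6 percent a year until 2024.
low = project(130, 0.06, 2024 - 2017)

print(f"2024 at 19 percent: USD {high:,.0f} million")
print(f"2024 at 6 percent:  USD {low:,.0f} million")
```

Under this assumption the higher estimate lands at roughly USD 1.6 billion by 2024, close to the USD 1.5 billion figure quoted, while the lower estimate stays under USD 200 million. The gap between the two comes as much from the growth rate as from the starting point.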

The reasons to believe the MT market is small are legitimate. With the exception of Google, Microsoft and defense contractors such as Raytheon BBN Technologies, most MT providers are microscopic companies. Important players like Omniscien, Kantan MT, Iconic, and Globalese are in the low millions USD in terms of revenue. We estimate SDL’s machine translation software business to be at around USD 6 million. Larger providers with decades-long history such as Systran and Promt never scaled past USD 10 – 15 million.

By all estimates, the money in pure MT is negligible compared to what it does for language services and how it disrupts the industry. However, it should not be viewed in isolation. MT enables post-editing services which could potentially completely transform the professional translation market.

Will there be a boom in MT revenues?

Yes, there will be.

Even without any new advancements in this market, we are already seeing growth driven by trends in the industry: post-editing, increasing volumes, new markets, new innovation, and new integrations.

MTPE

Switching to post-editing

Translators are gradually becoming post-editors. This trend will continue as MT output quality improves.

Increased volume

Demand is skyrocketing

Business and products are increasingly global, and there aren’t enough humans to translate everything manually.

New markets

Localizing into more languages

Indian and African languages, for example.

Integration

More in-app MT

More and more apps are including machine translation options directly in their app, facilitating easy communication where there used to be strong language barriers.

Innovation

Leading industries pave the way

Leading industries: software, ecommerce, travel, media & stock information, basic healthcare

In addition to these pre-existing growth drivers, we are seeing even more growth being driven by technological advancement and new entrants to the market. Google’s introduction of neural machine translation led to explosive growth in mainstream business media coverage of MT. Buyers became more aware of MT capabilities and their interest has been piqued. After all, the idea is that the universal Star Trek translator is around the corner, and that after 60 years of promises AI has finally cracked the problem of human language. This news functions as a self-fulfilling prophecy.

In 2016, SDL’s machine translation technology saw a 72 percent increase. SDL’s annual report does not reveal actual numbers, but taking into account the growth rates, we have calculated that they moved from £2.8 million to £4.5 million (USD 6 million) a year.

In 2017 Amazon introduced Amazon Translate, a service to rival Google Translate. The announcement arrived at the end of the year, but about two years’ worth of preparation and investment went into this system: in 2015 Amazon acquired MT company Safaba Solutions, and later opened a dedicated “MT office” in Pittsburgh.

The next colossus to enter the MT game will be Apple. As of the time of writing this, they were beta-testing an MT service internally, and hiring more scientists in the language technology field for their Munich office.

IT giants such as Amazon and Apple don’t invest resources into products unless they expect to gain a billion dollars in return. So, while other industry think tanks are predicting 6 – 7 percent annual growth, we believe the potential of MT is much larger.

Types of MT

Whether you view machine translation as a threat or an opportunity, it is important to understand how it has evolved over the years. The evolution is not yet complete, of course. Machine translation improves every day. In order to provide context to our conversation, we will first look at the different types of machine translation that have been developed, classified by technology, specialization and usage scenarios. 

Classified by technology

First we will look at the types of MT as classified by the type of technology used. Below is a list in order of development, broken down into three generations of MT innovation. 

First generation

RbMT

Rule-based MT

Uses countless algorithms based on language grammar, syntax, and phraseology. Good for repetitive content, such as ecommerce product names, but adding new language combinations takes years.

Second generation

SMT

Statistical MT

Pattern-matches millions of reference texts to find translations that are statistically most likely to be suitable. Training new engines is easy provided there is enough reference material.

RbMT/SMT

Hybrid rule based and statistical MT

A combination of statistical with added custom rules on top.

Meta

Meta-language approach

Experimental approach, translates into semantic machine language first, then to another language.

Third generation

NMT

Neural Machine Translation

Uses machine learning technology to teach software how to produce the best result. This process consumes a lot of processing power, which is why it is often run on graphics processing units (GPUs). NMT arrived in 2016, and most MT providers are now switching to this technology.

NMT/SMT

Hybrid neural and statistical machine translation

A combination of neural and statistical MT.

EBMT

Example based machine translation

Adaptive

Special type: adaptive machine translation

Adaptive MT works in CAT-tools and functions similarly to autosuggest. It offers suggestions to translators, and learns continuously from their selections. Both SMT and NMT systems can be made Adaptive.

Classified by specialization

When classifying machine translation by specialization, it is useful to look at how the engines are trained and used. Different users will have different content types and, therefore, different requirements. Below are some examples of machine translation providers that can be classified as providing generic, custom, and domain-specific solutions.

Type | Explanation | Example vendors
Generic | Doesn’t have specialization, doesn’t follow professional terminology. Equally good for anything. | Google Translate, Microsoft Translator
Custom | Uses client-specific terminology and fits their needs best, but requires training and maintenance. | KantanMT, Omniscien, Iconic, Globalese
Domain-specific | Ready-made specialized engine. Prioritizes terminology specific to the selected industry (medical, legal, automotive). Good with the chosen type of texts, bad with others. | Promt, Systran, most custom engines

 

Classified by usage scenarios

For a more practical view of how you can use machine translation, it is useful to classify different engines by usage scenario. Below are some examples of how to use specialized machine translation providers for different use-cases that you may have. 

Scenario | Example
Standalone desktop or mobile app | Promt mobile translator
Web portal | Google Translate
API | Almost any MT system
Backend system | Intel multilingual forum search
Integrated into a CAT-tool | XTM + Crosslang, memoQ + Omniscien
Integrated into browser | Chrome + Google Translate
Integrated into another app | Facebook post translation, Tripadvisor review translation
Tied in with hardware | Google Buds, Waver.ly earphones

Which MT systems offer the best quality

To answer this question, Nimdzi Insights has licensed a report from Inten.to comparing the translation quality of 11 systems in 35 language combinations. In 2017, Nimdzi, in partnership with Inten.to, ran a series of tests using news as a proxy for general business content. DeepL, Google Translate, Yandex, gtcom and Microsoft scored higher than the rest in overall performance. It is notable that DeepL, a startup, scored higher than Google Translate: this shows that it is possible to beat the giant with a quality database, even if it covers only a few languages.

It is very important to note here that quality was tested with news articles, a content type that produces favorable results for generic engines. Domain engines (e.g. automotive) like those used by Promt, Systran, or SAP would score much higher in testing with their respective subject matter.

The good news though, is that if you have a specific content type that you would like evaluated, this can easily be done. Please contact us to run a specialized test with content of your choice.

Evaluating the quality of MT

MT systems developers test daily to see if new data and algorithms lead to improvements in output quality. Due to the sheer number of tests, human evaluation is usually out of the question, and developers use automated metrics instead.

The most commonly used metric is BLEU (“bilingual evaluation understudy”), which shows how closely MT output corresponds to a human reference translation. It compares the MT output with reference translations and produces a score between 0 (worst) and 1 (best). While BLEU scores are widely used by MT researchers, they can be manipulated, and it takes a specialist to make sense of the results. Besides, BLEU favors statistical models over neural ones.
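To give a feel for how such a score is produced, here is a minimal sketch of a BLEU-style calculation. It is a simplified version under our own assumptions (one reference, whitespace tokenization, a single sentence, no smoothing) and is not the implementation used by any particular toolkit or in the evaluation discussed in this report.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference, sentence-level BLEU in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))
print(bleu("the cat sat", "the cat sat on the mat", max_n=2))
```

The second call shows one reason the scores need careful reading: every word of the short candidate is correct, yet the brevity penalty lowers the score because the candidate is shorter than the reference.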

BLEU has spawned many derivative metrics, including METEOR, ROUGE, HyTER, NIST and LEPOR (which is the metric used in the above evaluation). Derivative metrics attempt to fix some of the drawbacks of BLEU. However, BLEU has the advantage of longevity. Developers using it can compare the performance of their systems all the way back to the early stages of MT development, and can clearly estimate progress with statements such as “we’ve improved quality by 20 percent over existing systems.”

For actual use it is important to run human evaluation in parallel with machine tests.

Language coverage

MT engines launched by the search companies Google, Yandex and Baidu, along with Microsoft, offer the best language coverage, because through search they have access to the widest selection of parallel texts.

They do not directly cover all available combinations, and instead translate some of them via a middle language such as English. This process is called pivot translation.
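Translating via a middle language can be sketched in a few lines. The word-level lookup tables below are toy stand-ins for real engines, and all function and variable names are hypothetical; the point is only the routing logic.

```python
# Toy word-level "engines" for illustration only; real MT systems are
# not dictionary lookups. All names and data here are hypothetical.
DIRECT_ENGINES = {
    ("de", "en"): {"hund": "dog", "katze": "cat"},
    ("en", "fr"): {"dog": "chien", "cat": "chat"},
}

def translate_direct(text, source, target):
    """Translate word by word with a direct engine; KeyError if none exists."""
    table = DIRECT_ENGINES[(source, target)]
    return " ".join(table.get(word, word) for word in text.lower().split())

def translate(text, source, target, pivot="en"):
    """Use a direct engine if available, otherwise go through the pivot."""
    if (source, target) in DIRECT_ENGINES:
        return translate_direct(text, source, target)
    intermediate = translate_direct(text, source, pivot)
    return translate_direct(intermediate, pivot, target)

# There is no direct German-to-French engine here, so English is the bridge.
print(translate("Hund Katze", "de", "fr"))
```

Because the text passes through two engines, any error made in the first step is carried into the second, which is one reason quality can differ between directly covered pairs and bridged ones.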

Integrations

Google Translate and Microsoft Translator are integrated in most CAT tools. Other tools offer a varying set of CAT-tool integrations, with Kantan and Omniscien (ex-Asia Online) at the forefront.

Pricing 

When we talk about pricing for machine translation, we could look at two different aspects of pricing:

  • Pricing for machine translation as a service (i.e., raw); and
  • Pricing for post-editing of the machine translation

Pricing of raw machine translation

There are several different ways that machine translation providers choose to price machine translation.

APIs

Google, Microsoft, IBM and many others sell MT per million characters.

On-premise servers

In addition to APIs, companies like Promt and Systran offer standalone enterprise translation servers that can be deployed on the client’s hardware and used without a per-word limit. Server pricing is not published, and typically starts at USD 15,000 – USD 30,000 per project. Purchasing servers makes sense in the following scenarios:

  • High volume, preferably > 2,000,000 characters a year
  • High security requirements – machine translating confidential information that should not leave the premises

These companies also offer engine training services that are priced per hour.

Hosted SaaS

Custom machine translation vendors such as KantanMT and Omniscien offer browser suites to train MT engines and connect them to various environments. Customers can log in, upload their translation memory and add it to an existing baseline engine to train custom MT. The custom engine can then function separately, or be plugged into a TMS. Web solutions come on a subscription basis, and their price starts at USD 1,500 – 3,000 a month.

Pricing of post editing

Machine translation post-editing (PEMT) rates are not etched in stone, and many translation companies struggle to evaluate the internal cost correctly and to offer a corresponding price to the client. There are three main approaches to pricing PEMT.

Per word translation costs

The simplest and most common way to charge for machine translation post editing would be on a per word basis. Often this is represented as a percentage of the standard human translation rate, such as 20 – 30 percent of the translation rate per word.

Advantages

This pricing strategy is very simple and it is also very familiar. Buyers are used to thinking in terms of per-word costs when it comes to translation. This leads to predictable costs and a smoother negotiation phase. 

Disadvantages

Charging in this way creates problems at the production stage. Machine translation output quality varies, and the LSP doesn’t know in advance how difficult their task will be. Often the improvement in productivity does not correspond to the discount given. Linguists trudge through the machine output, get paid less for very tedious assignments, and end up resenting the LSP and declining new PEMT tasks.

Per hour

On the agency side, the per-hour approach requires putting in place time trackers and a means of evaluating the post-editors’ productivity, for example, a post-editing analysis step. Rates should be tied to productivity, and without tools it is hard to say why one editor completes ten pages an hour and another only two: because of MT quality, their talent and experience, or the level of attention they pay to details.

Advantages

Removes MT output quality risk at production stage.

Disadvantages

This makes it harder to quantify costs beforehand. The buyer ends up paying for the effort and not the result.

Per word with a per-match discount

The LSP can give a fuzzy match discount for the segments where translation closely matches raw MT output.

Advantages

This is the smartest approach so far. The buyer pays for the result (words), and the LSP can adapt to the highs and the lows of the MT output quality. 

Disadvantages

It becomes difficult to predict leverage, and implementing this is only possible with specialized software.
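The three approaches above can be compared side by side in a small sketch. Every rate and discount band below is a hypothetical placeholder, not a market figure, and the similarity measure (Python’s `difflib.SequenceMatcher`) is just one simple way to approximate how much the post-editor changed the raw MT. For brevity, words are counted on the final translation.

```python
from difflib import SequenceMatcher

# All rates and discount bands below are hypothetical placeholders.
FULL_RATE = 0.10      # USD per word for human translation
PEMT_SHARE = 0.30     # flat PEMT price as a share of the full rate
HOURLY_RATE = 30.0    # USD per post-editing hour
# (minimum similarity between raw MT and final text, share of full rate)
MATCH_BANDS = [(0.95, 0.20), (0.75, 0.50), (0.0, 1.00)]

def per_word(segments):
    """Flat percentage of the translation rate for every word."""
    words = sum(len(final.split()) for _, final in segments)
    return words * FULL_RATE * PEMT_SHARE

def per_hour(hours):
    """Bill the time actually spent post-editing."""
    return hours * HOURLY_RATE

def per_match(segments):
    """Per word, discounted when the final text is close to the raw MT."""
    total = 0.0
    for raw_mt, final in segments:
        similarity = SequenceMatcher(None, raw_mt, final).ratio()
        # Pick the first band whose similarity floor is met.
        share = next(s for floor, s in MATCH_BANDS if similarity >= floor)
        total += len(final.split()) * FULL_RATE * share
    return total

# Each segment is (raw MT output, final post-edited translation).
segments = [
    ("The cat sat on the mat", "The cat sat on the mat"),
    ("Dog house big", "The large kennel is empty"),
]
print(f"per word:  {per_word(segments):.2f}")
print(f"per hour:  {per_hour(0.1):.2f}")
print(f"per match: {per_match(segments):.2f}")
```

In this toy job the untouched segment is billed at the lowest band and the fully rewritten one at the full rate, which is how the per-match model adapts to the highs and lows of MT output quality. It also shows why the approach needs software: the similarity can only be measured after the work is done.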

Note: difference between translation in a CAT-tool and post-editing

In the past, a translator worked with the translation memory, and a post-editor received raw MT output to edit. Today most translations happen with the support of both technologies integrated into CAT-tools and TMS, blending the difference between editing MT and “translating.” In the future, most translations will be assisted with a smart blend of translation memory and MT.

Post-editing may feature actions needed for engine training. For example, this could be human linguists evaluating MT quality, flagging and classifying MT mistakes in addition to correcting them, feeding the corrected translations back into the engine for training purposes and experimenting with data input and output. 

Below are some different models spanning across Classic PEMT, Interactive PEMT, and Adaptive MT. It is very important to know the differences between these, because there are different effort levels (and therefore, different costs) involved with each. When agreeing to a project, it is necessary to clarify up front which post editing model is to be used. 

Post-editing type | Description | Example
Classic PEMT | MT output is already in the “target” segment, whether it is good or not. | https://youtu.be/Abijz71Lz8Y?t=13s
Classic light | Correct mistakes only, focus on speed |
Classic deep | Bring the robot translation to the human level |
Interactive PEMT | Target segment is empty, the linguist can choose to put an MT suggestion in place if it is good, or to translate manually if MT offers gibberish. Indistinguishable from modern translation. | https://www.youtube.com/watch?v=_5-SHTFZST4
Adaptive MT | MT engine adapts suggestions on the fly based on each input from the translator. The way translation is going. | https://www.youtube.com/watch?v=YZ7G3gQgpfI

Summary

Machine translation is improving every day. New developments are constantly rolled out and it is impossible to capture all of the necessary information in a single report. The information we have provided above is a crucial step towards defining your company’s machine translation strategy. 

If you are interested in discussing any of the information in this report with the Nimdzi team, we are happy to talk to you. Remember that, as a Nimdzi Partner, you have full access to our team. Do not hesitate to reach out. You can schedule time directly on our calendar as easily as clicking a button!

