Rise of the machines: The state of MT - cost, quality, and coverage
Machine translation is perhaps the most rapidly evolving space in the language services industry. It is easy to get lost with new innovations being announced each day. This report will help you stay up to speed without having to research for days.
In this Insight Report, we look at top machine translation providers and how they stack up. Specifically, we analyze language coverage, pricing, integrations and, of course, output quality.
Information contained in this report:
- How LSPs make money
- Market size
- Types of MT
- Ranking quality output
- Language coverage
- CAT Tool integration
Note that in this report we look at output quality for the 11 top machine translation providers for 35 language pairs, both from and into English. To get this information Nimdzi has partnered with Inten.to.
“Intento benchmarks Cognitive Services and provides a single API to use all of them. We help to discover and use the best Artificial Intelligence services for every task without spending human effort on integration and switching providers.”
We know you are busy! That’s why each of our reports is formatted so that information can be quickly and easily digested. For those in a hurry, we provide the TL;DR (Too long; didn’t read!) section at the beginning of each report. At the end of each report we also summarize key points with our own Insights.
For those that think the devil is in the details, we have you covered, too! In the body of the report we go into as much detail as possible about each topic discussed. Still not satisfied? Never hesitate to reach out to us directly to request additional information. We’re here to help!
Google Translate processed 146 billion words a day in 2016. That is already three times more than all professional translators in the world together can do in a month. Since then, the scale has only been growing.
For those of you that are worried about machine translation taking over, we are sorry to report that the doomsday scenario has already snuck up on us. The machines have already surpassed us.
Contrary to common expectations, the machines have not stolen all of our jobs. Global demand for language services continues to increase every year. The demand for quality linguists is at record highs. Machine translation (at least in its raw form) does not yet match human translation, and human professional translators are still required for projects that require quality translations.
Technology is not replacing translators. Translators using technology are replacing those who don’t.
However, that doesn’t mean professional translation is completely safe from MT or that professional services providers can ignore MT. With post-editing for most types of content, and raw MT for user-generated content, it is becoming urgent for translation companies and global brands to understand and harness the power of machine translation.
How LSPs make money with MT
Based on our interviews, the clear majority of translation companies see machine translation primarily as a threat. They do not yet generate a lot of new revenue by selling machine translation solutions. Instead, the quickly growing volume of post-editing work is cannibalizing income from professional translations. PEMT often warrants lower margins, partly because the service is new and requires adaptation from LSP vendor management, project management and sales. That’s why established LSPs often try to slow down MT adoption into buyer workflows, citing “client miseducation” about MT.
LSP opportunities with MT lie in using it internally to improve efficiency and margins, and in taking on projects that were previously not feasible due to schedule and budgetary constraints.
Most common uses for MT:
- Translating user-generated content
- High-volume product descriptions
How much money is there in MT?
The MT software market is quite small, with estimates ranging from USD 130 million to USD 400 million. This is a grain of sand compared to the USD 25,000 – 40,000 million in professional translation services.
Industry outsiders typically give the MT market a higher valuation, while insiders are more modest. For example, Global Market Insights predicted MT to hit USD 1.5 billion by 2024. Industry insider think tank TAUS made its first estimate in 2015 at USD 250 million, and corrected it down to USD 130 million two years later. It foresees a 6 percent growth rate for the coming years.
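To put these figures side by side, the short sketch below compounds the TAUS estimate at the quoted 6 percent and computes the constant annual growth rate that would be needed to reach the Global Market Insights figure. It assumes a 2017 base year (the USD 130 million estimate came two years after 2015) and steady growth every year; the function names are illustrative only.

```python
def project(base_usd_m, annual_growth, years):
    """Compound a market size (in USD millions) over a number of years."""
    return base_usd_m * (1 + annual_growth) ** years


def required_growth(base_usd_m, target_usd_m, years):
    """Constant annual growth rate needed to move from base to target."""
    return (target_usd_m / base_usd_m) ** (1 / years) - 1


YEARS = 2024 - 2017  # assumption: 2017 is the base year for both figures

# TAUS: USD 130 million growing at 6 percent a year.
taus_2024 = project(130, 0.06, YEARS)

# Global Market Insights: USD 1.5 billion (1,500 million) by 2024.
gmi_rate = required_growth(130, 1500, YEARS)

print(f"USD 130M at 6% a year reaches about USD {taus_2024:.0f}M by 2024")
print(f"Reaching USD 1,500M by 2024 needs about {gmi_rate:.0%} growth a year")
```

The gap between the two results gives a rough sense of how far apart insider and outsider estimates are.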
The reality is that estimates vary widely and every number we have seen so far is an educated guess. Below are some of the projections that have been published over the last three years.
2015 market sizing estimates
2016 market sizing estimates
2017 market sizing estimates
The reasons to believe the MT market is small are legitimate. With the exception of Google, Microsoft and defense contractors such as Raytheon BBN Technologies, most MT providers are microscopic companies. Important players like Omniscien, Kantan MT, Iconic, and Globalese are in the low millions of USD in terms of revenue. We estimate SDL’s machine translation software business to be at around USD 6 million. Larger providers with decades-long histories such as Systran and Promt never scaled past USD 10 – 15 million.
By all estimates, the money in pure MT is negligible compared to what it does for language services and how it disrupts the industry. However, it should not be viewed in isolation. MT enables post-editing services which could potentially completely transform the professional translation market.
Will there be a boom in MT revenues?
Yes, there will be.
Even without any new advancements in this market, we are already seeing growth driven by existing trends in the industry: post-editing, increasing volumes, new markets, new innovation, and new integrations.
- Switching to post-editing
- Demand is skyrocketing
- Localizing into more languages
- More in-app MT
- Leading industries pave the way
In addition to these pre-existing growth drivers, we are seeing even more growth being driven by technological advancement and new entrants to the market. Google’s introduction of neural machine translation led to explosive growth in mainstream business media coverage of MT. Buyers became more aware of MT capabilities and their interest has been piqued. After all, the idea is that a universal Star Trek translator is around the corner, and that after 60 years of promises AI has finally cracked the problem of human language. This news functions as a self-fulfilling prophecy.
In 2016, SDL’s machine translation technology saw a 72 percent increase. SDL’s annual report does not reveal actual numbers, but taking into account the growth rates, we have calculated that they moved from £2.8 million to £4.5 million (USD 6 million) a year.
In 2017 Amazon introduced Amazon Translate, a service to rival Google Translate. The announcement arrived at the end of the year, but about two years’ worth of preparation and investment went into this system: in 2015 Amazon acquired MT company Safaba Solutions, and later opened a dedicated “MT office” in Pittsburgh.
The next colossus to enter the MT game will be Apple. At the time of writing, they were beta-testing an MT service internally, and hiring more scientists in the language technology field for their Munich office.
IT giants such as Amazon and Apple don’t invest resources into products unless they expect to gain a billion dollars in return. So, while other industry think tanks are predicting 6 – 7 percent annual growth, we believe the potential of MT is much larger.
Types of MT
Whether you view machine translation as a threat or an opportunity, it is important to understand how it has evolved over the years. The evolution is not yet complete, of course. Machine translation improves every day. In order to provide context to our conversation, we will first look at the different types of machine translation that have been developed, classified by technology, specialization and usage scenarios.
Classified by technology
First we will look at the types of MT as classified by the type of technology used. Below is a list in order of development, broken down into three generations of MT innovation.
Classified by specialization
When classifying machine translation by specialization, it is useful to look at how the engines are trained and used. Different users will have different content types and, therefore, different requirements. Below are some examples of machine translation providers that can be classified as providing generic, custom, and domain-specific solutions.
|Generic||Doesn’t have specialization, doesn’t follow professional terminology. Equally good for anything.||Google Translate, Microsoft Translator|
|Custom||Uses client-specific terminology and fits their needs best, but requires training and maintenance.||KantanMT, Omniscien, Iconic, Globalese|
|Domain-specific||Ready-made specialized engine. Prioritizes terminology specific to the selected industry (medical, legal, automotive). Good with the chosen type of texts, bad with others.||Promt, Systran, most custom engines|
Classified by usage scenarios
For a more practical view of how you can use machine translation, it is useful to classify different engines by usage scenario. Below are some examples of how to use specialized machine translation providers for different use-cases that you may have.
|Standalone desktop or mobile app||Promt mobile translator|
|Web portal||Google Translate|
|API||Almost any MT system|
|Backend system||Intel multilingual forum search|
|Integrated into a CAT-tool||XTM + Crosslang, memoQ + Omniscien|
|Integrated into browser||Chrome + Google Translate|
|Integrated into another app||Facebook post translation, Tripadvisor review translation|
|Tied in with hardware||Google Buds, Waver.ly earphones|
Which MT systems offer the best quality
To answer this question, Nimdzi Insights licensed a report from Inten.to comparing the translation quality of 11 systems in 35 language combinations. In 2017, Nimdzi, in partnership with Inten.to, ran a series of tests using news as a proxy for general business content. DeepL, Google Translate, Yandex, gtcom and Microsoft scored higher than the rest in overall performance. It is notable that DeepL, a startup, scored higher than Google Translate: this shows that it is possible to beat the giant with a quality database, even if it covers only a few languages.
It is very important to note that quality was tested with news articles, a content type that favors generic engines. Domain-specific engines (e.g., automotive) like those offered by Promt, Systran, or SAP would score much higher when tested on their respective subject areas.
The good news, though, is that if you have a specific content type that you would like evaluated, this can easily be done. Please contact us to run a specialized test with content of your choice.
Evaluating the quality of MT
MT systems developers test daily to see if new data and algorithms lead to improvements in output quality. Due to the sheer number of tests, human evaluation is usually out of the question, and developers use automated metrics instead.
The most commonly used metric is BLEU (bilingual evaluation understudy), which shows how closely MT output corresponds to a human translation. It compares the MT output with parallel human reference translations and produces a score between 0 (worst) and 1 (best). While BLEU scores are widely used by MT researchers, they can be manipulated, and it takes a specialist to make sense of the results. Besides, BLEU tends to favor statistical models over neural ones.
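To make the 0-to-1 scale concrete, here is a minimal sketch of a sentence-level BLEU calculation. It is a simplification for illustration, not the scoring code used in the evaluation above: it assumes a single reference translation, lowercased whitespace tokenization, n-grams up to length 4 and no smoothing, whereas production tools score whole test sets and handle tokenization more carefully.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU, from 0.0 (worst) to 1.0 (best)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped counts: an n-gram in the MT output is credited at most
        # as many times as it appears in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: output shorter than the reference is penalized.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(sentence_bleu("the cat sat on the mat", reference))  # identical output
print(sentence_bleu("the cat sat on a mat", reference))    # one word differs
```

Because the score depends only on overlapping word sequences, a perfectly acceptable translation that uses different wording can still score low, which is one reason the results need specialist interpretation.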
BLEU has spawned many derivative metrics, including METEOR, ROUGE, HyTER, NIST and LEPOR (which is the metric used in the above evaluation). Derivative metrics attempt to fix some of the drawbacks of BLEU. However, BLEU has the advantage of longevity. Developers using it can compare the performance of their systems all the way back to the early stages of MT development, and can clearly estimate progress with statements such as “we’ve improved quality by 20 percent over existing systems.”
For actual use it is important to run human evaluation in parallel with machine tests.
MT systems launched by the search engine companies Google, Yandex and Baidu, along with Microsoft, offer the best language coverage, because through search they have access to the widest selection of parallel texts.
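They do not directly cover all available combinations; instead, they translate the remaining pairs via a middle language such as English. This process is commonly called pivot translation. It differs from zero-shot translation, in which a multilingual neural model translates directly between a language pair it was never explicitly trained on.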
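Translating via a middle language, as described above, amounts to chaining two engines. The toy sketch below shows the idea with word-for-word lookup tables standing in for real MT engines; the tables, function names and language pair are all hypothetical.

```python
# Toy word-for-word "engines" used only for illustration.
# Real MT systems are far more complex than a dictionary lookup.
FR_TO_EN = {"chat": "cat", "noir": "black"}
EN_TO_DE = {"cat": "Katze", "black": "schwarz"}


def translate(text, table):
    """Translate word by word, leaving unknown words unchanged."""
    return " ".join(table.get(word, word) for word in text.split())


def translate_via_middle_language(text, source_to_middle, middle_to_target):
    """Translate source -> middle language -> target in two hops."""
    middle_text = translate(text, source_to_middle)
    return translate(middle_text, middle_to_target)


# French -> English -> German, with no direct French-German table.
print(translate_via_middle_language("chat noir", FR_TO_EN, EN_TO_DE))
# prints "Katze schwarz"
```

Any error made in the first hop is carried into the second, which is why combinations served through a middle language tend to be weaker than directly covered ones.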
Google Translate and Microsoft Translator are integrated in most CAT tools. Other MT providers offer a varying set of CAT-tool integrations, with Kantan and Omniscien (formerly Asia Online) at the forefront.
When we talk about pricing for machine translation, there are two different aspects to look at:
- Pricing for machine translation as a service (i.e., raw MT)
- Pricing for post-editing of the machine translation
Pricing of raw machine translation
There are several different ways that machine translation providers choose to price machine translation.
Google, Microsoft, IBM and many others sell MT per million characters.
In addition to APIs, companies like Promt and Systran offer standalone enterprise translation servers that can be deployed on the client’s hardware and used without a per-word limit. Server pricing is not published, and typically starts at USD 15,000 – USD 30,000 per project. Purchasing servers makes sense in the following scenarios:
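As a rough illustration of per-character pricing, the sketch below converts a character volume into a cost. The price of USD 20 per million characters is a placeholder assumption, not a figure from this report; check each provider's current price list before budgeting.

```python
def raw_mt_cost(characters, usd_per_million_chars):
    """Cost in USD of sending a number of characters through an MT API."""
    return characters / 1_000_000 * usd_per_million_chars


# Hypothetical price, used only for the example.
PRICE_PER_MILLION = 20.0

monthly_characters = 2_500_000
print(raw_mt_cost(monthly_characters, PRICE_PER_MILLION))  # prints 50.0
```

Note that providers count characters, not words, so the same word count costs more in languages with longer average words.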
- High volume, preferably > 2,000,000 characters a year
- High security requirements – machine translating confidential information that should not leave the premises
These companies also offer engine training services that are priced per hour.
Custom machine translation vendors such as KantanMT and Omniscien offer browser suites to train MT engines and connect them to various environments. Customers can log in, upload their translation memory and add it to an existing baseline engine to train custom MT. The custom engine can then function separately, or be plugged into a TMS. Web solutions come on a subscription basis, and their price starts at USD 1,500 – 3,000 a month.
Pricing of post editing
Machine translation post-editing (PEMT) rates are not set in stone, and many translation companies struggle to evaluate the internal cost correctly and to offer a corresponding price to the client. There are three main approaches to pricing PEMT.
Per word translation costs
The simplest and most common way to charge for machine translation post-editing is on a per-word basis. Often this is expressed as a percentage of the standard human translation rate, such as 20 – 30 percent of the translation rate per word.
On the agency side, the per-hour approach requires putting in place time trackers and a means of evaluating the post-editor’s productivity, for example, a post-editing analysis step. Rates should be tied to productivity, and without tools it is hard to say why one editor completes ten pages an hour while another produces only two: because of MT quality, their talent and experience, or the level of attention they pay to details.
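The per-word arithmetic can be sketched in a few lines. The translation rate of USD 0.10 per word and the 25 percent share below are hypothetical values chosen for the example; only the 20 to 30 percent range comes from the text above.

```python
def pemt_price(word_count, translation_rate_per_word, pemt_share):
    """Price of a post-editing job charged as a share of the full rate."""
    return word_count * translation_rate_per_word * pemt_share


# Hypothetical job: 10,000 words, USD 0.10 per word for full translation,
# post-editing charged at 25 percent of that rate.
price = pemt_price(10_000, 0.10, 0.25)
print(f"USD {price:.2f}")  # prints "USD 250.00"
```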
Per word with a per-match discount
The LSP can give a fuzzy match discount for the segments where translation closely matches raw MT output.
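One way to picture a per-match discount is to measure how much of the raw MT output survived post-editing and map that similarity to a payment band. The sketch below uses Python's standard `difflib.SequenceMatcher` on word lists; the bands and percentages are hypothetical, and CAT tools compute their own match scores with different algorithms.

```python
from difflib import SequenceMatcher

# Hypothetical bands: (minimum similarity, share of the full rate paid).
# Listed from highest threshold to lowest.
BANDS = [(0.95, 0.30), (0.85, 0.50), (0.75, 0.70), (0.0, 1.00)]


def similarity(mt_output, final_translation):
    """Word-level similarity between raw MT and the final text (0.0 to 1.0)."""
    matcher = SequenceMatcher(None, mt_output.split(), final_translation.split())
    return matcher.ratio()


def rate_share(sim):
    """Share of the full per-word rate paid for a given similarity."""
    for threshold, share in BANDS:
        if sim >= threshold:
            return share
    return 1.0


def segment_price(mt_output, final_translation, full_rate_per_word):
    """Price of one segment after applying the match discount."""
    words = len(final_translation.split())
    share = rate_share(similarity(mt_output, final_translation))
    return words * full_rate_per_word * share


raw = "The engine must be switched off before maintenance"
edited = "The engine must be turned off before servicing"
print(round(similarity(raw, edited), 2))
print(segment_price(raw, edited, 0.10))
```

The closer the final translation is to the raw MT, the lower the band, so segments that needed almost no editing are billed at a fraction of the full rate.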
Note: difference between translation in a CAT-tool and post-editing
In the past, a translator worked with the translation memory, and a post-editor received raw MT output to edit. Today most translations happen with the support of both technologies integrated into CAT tools and TMS, blurring the difference between editing MT and “translating.” In the future, most translations will be assisted by a smart blend of translation memory and MT.
Post-editing may feature actions needed for engine training. For example, this could be human linguists evaluating MT quality, flagging and classifying MT mistakes in addition to correcting them, feeding the corrected translations back into the engine for training purposes and experimenting with data input and output.
Below are some different models spanning across Classic PEMT, Interactive PEMT, and Adaptive MT. It is very important to know the differences between these, because there are different effort levels (and therefore, different costs) involved with each. When agreeing to a project, it is necessary to clarify up front which post editing model is to be used.
|Classic PEMT||MT output is already in the “target” segment, whether it is good or not.||https://youtu.be/Abijz71Lz8Y?t=13s|
|Classic light||Correct mistakes only, focus on speed|
|Classic deep||Bring the robot translation to the human level|
|Interactive PEMT||Target segment is empty; the linguist can choose to put an MT suggestion in place if it is good, or to translate manually if MT offers gibberish. Indistinguishable from modern translation.||https://www.youtube.com/watch?v=_5-SHTFZST4|
|Adaptive MT||MT engine adapts suggestions on the fly based on each input from the translator. The way translation is going.||https://www.youtube.com/watch?v=YZ7G3gQgpfI|
Machine translation is improving every day. New developments are constantly rolled out and it is impossible to capture all of the necessary information in a single report. The information we have provided above is a crucial step towards defining your company’s machine translation strategy.
If you are interested in discussing any of the information in this report with the Nimdzi team, we are happy to talk to you. Remember that, as a Nimdzi Partner, you have full access to our team. Do not hesitate to reach out. You can schedule time directly on our calendar as easily as clicking a button!