Machine Translation: Productivity as a Function of Quality

Nimdzi Finger Food is the bite-sized and free to sample insight you need to fuel your decision-making today.

If you want to learn more from our experts about language technology available today, contact us today.

We recently introduced you to the two- (or five-) second rule, which is essentially the reaction or decision-making time a linguist should spend judging whether to post-edit a segment of machine translation (MT) output or to retranslate it.

This rule of thumb aims to help increase the linguist’s productivity when working with MT.

Another way of looking at the task of increasing productivity is through an MT auto-select feature we described here last year. It’s an approach that’s available in tools such as Memsource Translate, which also includes a Machine Translation Quality Estimation (MTQE) feature. MTQE helps users evaluate the quality of the MT output: scores are automatically calculated before any post-editing (PE) is done and appear at the segment level together with other translation resources (e.g., the translation memory). 

These are the four MTQE scoring categories:

Score       Category
100%        Excellent MT match, probably no PE required
99%         Near-perfect MT output, possibly minor PE required, mostly for typographical errors
75%         Good MT match, but likely to require PE
No score    When there is no score, it’s very likely that the MT output is of low quality; it is recommended that this output be used for reference only

MTQE could help reduce or even eliminate the need for the two-second MTPE rule. With a built-in feature like this, linguists no longer need to decide whether or not to post-edit: the machine makes that call for them. Whenever MTQE predicts a score of 75% or above, the corresponding segment is a candidate for post-editing right away. When no score is predicted, the MT output can be discarded.
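The rule just described can be sketched in a few lines of Python. This is only an illustration of the decision logic, not Memsource's implementation: the function name `route_segment`, the action labels, and the handling of any score below 75 are assumptions made for the example.

```python
from typing import Optional

# Threshold taken from the rule of thumb in the text (75% or above).
POST_EDIT_THRESHOLD = 75


def route_segment(mtqe_score: Optional[int]) -> str:
    """Suggest an action for one segment based on its MTQE score.

    `mtqe_score` is None when no score was predicted for the segment.
    """
    if mtqe_score is None:
        # No score: the MT output is likely low quality, so discard it
        # (or keep it for reference only) and translate from scratch.
        return "translate_from_scratch"
    if mtqe_score >= POST_EDIT_THRESHOLD:
        # 75, 99 or 100: start post-editing right away.
        return "post_edit"
    # Assumption: a score below the threshold is treated like "no score".
    return "translate_from_scratch"


# Example: scores for four segments, the last one without a prediction.
actions = [route_segment(s) for s in (100, 99, 75, None)]
print(actions)
```

Keeping the threshold in a single constant makes it easy to experiment with a stricter cut-off, for example sending only 99% and 100% segments straight to post-editing.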

So what exactly does all this tell us about how MT quality and post-editor productivity are correlated? Let’s have a look at the chart below, which shows how MTPE productivity (in words per hour) changes as MT quality increases.

Source: Memsource. Productivity for EN>DE

On the X axis, 100 means “perfect” MT: no post-editing needed. Post-editing productivity is plotted alongside the productivity of translating from scratch (the flat lines). There are two productivity lines, one for each segment-length range (3 to 8 words and 9 to 26 words). The absolute productivity numbers may be a bit higher, but what’s important here is the observed trend: post-editing low-quality MT is less productive than translating from scratch, and higher-quality MT increases productivity considerably.

As noted in a LocWorld39 presentation, this is how Memsource measures performance (conceptually, the method is still the same):

  • The X axis is the true PE score of an MT output
  • The Y axis is the number of segments from a testing data set
  • The colors are MTQE predictions

Source: Memsource. LocWorld 39, “Quality Estimation in the AI Era”

At the far right are the perfect MT outputs. MTQE identified most of them as perfect (green bar), some as 75 (good MT, orange bar), and a very small number as bad (blue bar).
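The chart's layout (true PE score on the X axis, segment counts on the Y axis, colors for the MTQE prediction) amounts to a tally of predictions per true score. A rough Python sketch of such a tally follows; the function name `tally_predictions`, the category labels, and the sample data are all made up for illustration and are not Memsource's data or code.

```python
from collections import Counter, defaultdict


def tally_predictions(segments):
    """Count MTQE predictions for each true PE score.

    `segments` is an iterable of (true_pe_score, predicted_category) pairs.
    Returns a mapping: true score -> Counter of predicted categories.
    """
    histogram = defaultdict(Counter)
    for true_score, predicted in segments:
        histogram[true_score][predicted] += 1
    return histogram


# Invented sample: four segments that truly needed no post-editing (100)
# and one that needed some (80), each with its hypothetical MTQE prediction.
sample = [
    (100, "100"),
    (100, "100"),
    (100, "75"),
    (100, "no score"),
    (80, "75"),
]

hist = tally_predictions(sample)
for true_score in sorted(hist, reverse=True):
    print(true_score, dict(hist[true_score]))
```

Each Counter corresponds to one stacked bar in the chart: the keys are the colors and the values are the bar heights.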

This experiment indicates that the post-editor’s productivity is a function of the MT output quality. MTQE is an example of how automated quality estimation can be incorporated into the localization workflow and leveraged to benefit productivity.
