Term Extraction: XTM Blazes a Trail with New Norms

Nimdzi Finger Food is the bite-sized and free to sample insight you need to fuel your decision-making today.

If you want to learn more from our experts about the language technology available today, contact us.

New challenges brought about by doing business in our digital world demand new solutions. Some constants still remain, however, without which a text and the quality of its translation would be less than satisfactory. One good example of such a constant is terminology and terminology management.

Terminology management includes a number of different aspects, but it usually starts with terminology extraction. As we wrote in 2018, if there’s no glossary, the first task is terminology mining (also called terminology harvesting or gathering).

What is term extraction?

‘Term extraction’ is understood as the compilation of a list of terms whose translation should be consistent throughout a project. The result of extracting terminology is a glossary that lists the terms along with their contexts. Some extraction tools provide statistical solutions for gathering a list of terms for which translations do not yet exist. The translations are then created either in the course of a project or as a separate process, for example by delegating this task to a terminologist.
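
To make the statistical approach more concrete, here is a minimal sketch in Python. It is not how any particular tool works: it simply counts word sequences that repeat in a text and skips those that start or end with a common function word. The function name, the short stopword list, and the thresholds are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real tools use fuller lists).
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is", "for", "on", "with", "by"}

def extract_term_candidates(text, max_len=3, min_freq=2):
    """Return (candidate, frequency) pairs for word sequences that repeat in the text."""
    counts = Counter()
    # Split into sentences first so that no candidate spans a sentence boundary.
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z][a-z-]*", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # A candidate may not start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    # Keep only candidates that occur at least min_freq times, most frequent first.
    return [(term, freq) for term, freq in counts.most_common() if freq >= min_freq]
```

A list produced this way is only a set of candidates. A terminologist would still need to decide which entries are real terms and supply their translations.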

Bilingual Term Extraction Software

To speed up the process, some tools extract bilingual terms from reference files and from previous translations. SynchroTerm (from Terminotix, the maker of LogiTerm), for example, automatically extracts terms, their equivalents, and contexts from file pairs in any format, as well as from bitexts and SDLXLIFF, XLIFF, or TMX files.
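
As a rough illustration of what extracting bilingual candidates from previous translations can involve, the sketch below reads source and target segments from a TMX document and scores word pairs by how often they appear in the same translation unit (the Dice coefficient). This is a textbook co-occurrence measure chosen for simplicity, not the method used by SynchroTerm or any other product named here. The function names are hypothetical, and language codes are matched exactly, so "en" will not match "en-US".

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def read_tmx_pairs(tmx_text, src_lang, tgt_lang):
    """Yield (source, target) segment pairs from a TMX document given as a string."""
    root = ET.fromstring(tmx_text)
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # TMX 1.4 uses xml:lang; older files use a plain "lang" attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang.lower()] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            yield segs[src_lang], segs[tgt_lang]

def dice_scores(pairs):
    """Score each (source word, target word) pair by Dice co-occurrence."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint.update((s, t) for s in src_words for t in tgt_words)
    # Dice = 2 * joint occurrences / (source occurrences + target occurrences).
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }
```

Word pairs that always occur together score 1.0, while pairs that share only some translation units score lower. On real data the highest-scoring pairs would still need review, since frequent function words also co-occur often.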

Most terminology management systems (TBS) feature term extraction functionality, but some rely on third-party extraction tools like MultiTerm Extract. The same is true of translation management systems (TMS). This means that in a regular translation workflow inside a TMS, a linguist would probably use a third-party statistical terminology extractor. However, there are TMS that offer built-in options for this process. You can have a look at some examples of such TMS by selecting the “terminology extraction” filter on the Nimdzi TMS feature overview page.


Four examples of mainstream TMS with term extraction capabilities. Source: Nimdzi TMS feature overview tool.

Bilingual terminology extraction productivity, rethought

Terminology management is an essential step in any successful translation project workflow, and the productivity norms used to measure it have been evolving. Earlier in 2020, we published a post about productivity in terminology management. It garnered attention from academic circles, whose representatives pointed out that the productivity metric for translating a term should be lower than the one widely used within the localization industry.

Indeed, in some cases, five seconds may be too little for a terminologist to decide on a term candidate, and an hour may not be enough to translate 50 terms into one target language. In other instances, though, even higher productivity rates for building terminology lists are already being achieved. For instance, Omniscien offers a solution with productivity more than three times higher: its terminology extraction from subtitles and automatic terminology translation present options to the user, who then votes for the best suggestion. Of course, the machine will not always be right, but, according to Omniscien, this scheme helps achieve a translation productivity rate of 180 terms per hour.
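
To see what these norms mean for a project schedule, the short calculation below applies the figures quoted above (five seconds per term candidate, 50 terms per hour, and 180 terms per hour) to a job of 1,000 candidates and 500 terms. The job size is a hypothetical chosen only for illustration, and the function names are ours.

```python
SECONDS_PER_HOUR = 3600

def review_hours(candidates, seconds_per_candidate):
    """Hours needed to accept or reject a list of term candidates."""
    return candidates * seconds_per_candidate / SECONDS_PER_HOUR

def translation_hours(terms, terms_per_hour):
    """Hours needed to translate a term list into one target language."""
    return terms / terms_per_hour

# Hypothetical job size, chosen only for illustration.
CANDIDATES = 1000
TERMS = 500

print(f"Candidate review at 5 s each: {review_hours(CANDIDATES, 5):.2f} h")    # 1.39 h
print(f"Translation at 50 terms/h:    {translation_hours(TERMS, 50):.2f} h")   # 10.00 h
print(f"Translation at 180 terms/h:   {translation_hours(TERMS, 180):.2f} h")  # 2.78 h
print(f"Ratio of the two rates:       {180 / 50:.1f}x")                        # 3.6x
```

On these assumptions, the difference between the two translation rates is roughly a working day per 500 terms and per target language, which is why the choice of norm matters when quoting or planning terminology work.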

Another milestone in bilingual terminology extraction has recently been set by XTM. Its newly developed feature, available in XTM v12.4 and later, helps build terminology lists from existing translations with up to 90 percent accuracy.

Bilingual term extraction by XTM

Source: Process Innovation Challenge, Locworld

Term extraction productivity gains: XTM sets the bar

“XTM is an innovative company, more so than many other TMS providers. It invests in linguistic intelligence. Innovation is not something you can put amongst TMS requirements, but if you were to do so, then XTM would score very well.”

István Lengyel, Belazy Ltd.

For its automatic extraction of bilingual terminology, XTM uses big data, AI, and advances in computational linguistics, including an inter-language vector space. The feature already works for 50 languages, helping XTM customers save up to 80 percent of the time spent on glossary creation.

“The XTM AI team has developed a new technology to take a mundane and tedious process away from the terminologist. The bilingual term extraction performed during the alignment of the parallel source and target texts produces a spreadsheet with the data required to review and add terminology. One implication of this is that XTM users will see 80% productivity improvement over manual methods.”

Sara Basile, XTM International
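
XTM has not published the details of its implementation here, so the following is only a toy sketch of the general idea behind a shared vector space: if source and target terms are represented as vectors in the same space, likely equivalents can be paired by cosine similarity and written out as spreadsheet rows for review. The vectors, the threshold, the column names, and the function names are all made up for illustration.

```python
import csv
import io
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def match_terms(src_vectors, tgt_vectors, threshold=0.8):
    """Pair each source term with its most similar target term above a threshold."""
    if not tgt_vectors:
        return []
    rows = []
    for src, u in src_vectors.items():
        best, score = max(
            ((tgt, cosine(u, v)) for tgt, v in tgt_vectors.items()),
            key=lambda item: item[1],
        )
        if score >= threshold:
            rows.append((src, best, round(score, 3)))
    return rows

def to_csv(rows):
    """Render candidate pairs as CSV text that a reviewer can open as a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["source_term", "target_term", "similarity"])
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up three-dimensional vectors; real systems learn far larger ones from data.
source = {"valve": [0.9, 0.1, 0.0], "door": [0.0, 0.2, 0.9]}
target = {"vanne": [0.8, 0.2, 0.1], "porte": [0.1, 0.1, 0.95]}
print(to_csv(match_terms(source, target)))
```

The similarity column gives the reviewer a way to sort candidates and start with the most confident pairs, which is one plausible way such a spreadsheet could support the review step described in the quote.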

XTM sells to both enterprises and language service providers (LSPs). This gives many different localization industry players an opportunity to try this promising automated approach, which helps tackle the challenge of aligning texts and extracting terminology efficiently.

21 November 2020
