New challenges brought about by doing business in our digital world demand new solutions. Some constants remain, however, and without them the quality of a text and of its translation would be less than satisfactory. One good example of such a constant is terminology and terminology management.
Terminology management includes a number of different aspects, but it usually starts with terminology extraction. As we wrote in 2018, if there’s no glossary, the first task is terminology mining (also known as terminology harvesting or gathering).
‘Term extraction’ means compiling a list of terms whose translations should stay consistent throughout a project. The result is a list of terms, each with its context, recorded in the glossary. Some extraction tools use statistical methods to gather terms for which translations do not yet exist. The translations are then created either in the course of the project or as a separate process, for example by delegating the task to a terminologist.
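To make the statistical approach more concrete, here is a minimal sketch in Python. It is not how any particular tool works. It simply counts word sequences of up to three words, drops those that start or end with a common function word, and keeps the frequent ones that are not yet in the glossary, together with the first sentence in which each appears. The function name `extract_term_candidates`, its parameters, and the tiny stopword list are our own assumptions for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real extractor would use a much fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is",
             "for", "on", "with", "be", "must", "before"}

def extract_term_candidates(text, known_terms=(), max_len=3, min_freq=2):
    """Return (term, frequency, context) tuples for frequent n-grams
    that are not already in the glossary (known_terms)."""
    known = {t.lower() for t in known_terms}
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    counts = Counter()
    contexts = {}
    for sentence in sentences:
        tokens = re.findall(r"[a-z][a-z-]*", sentence.lower())
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # A candidate may not start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                term = " ".join(gram)
                counts[term] += 1
                # Keep the first sentence in which the term occurs as its context.
                contexts.setdefault(term, sentence)
    return [(term, freq, contexts[term])
            for term, freq in counts.most_common()
            if freq >= min_freq and term not in known]
```

Calling `extract_term_candidates(text, known_terms=glossary_terms)` on a source document returns only the candidates that still need a translation, which is the list a terminologist would then review.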
To speed up the process, some tools offer extraction of bilingual terms from reference files and from previous translations. SynchroTerm (a Terminotix product, from the makers of LogiTerm), for example, automatically extracts terms, their equivalents, and contexts from file pairs in any format, as well as from bitexts and SDLXLIFF, XLIFF, or TMX files.
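As a rough illustration of what bilingual extraction from a TMX file involves, the sketch below reads source and target segments from a TMX document and proposes an equivalent for a given source term by co-occurrence, using the Dice coefficient. This is a generic textbook technique, not a description of how SynchroTerm or any other product works. The helper names `read_tmx_pairs` and `best_equivalent` are hypothetical, language codes are matched exactly (so "en" will not match "en-US"), and the term lookup is a plain substring match.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def read_tmx_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs from a TMX document string."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # Older TMX versions use a plain "lang" attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs

def best_equivalent(term, pairs):
    """Propose the target word that co-occurs most strongly with `term`.

    Score = Dice coefficient: 2 * (segments with both) /
    (segments with the term + segments with the target word).
    """
    term = term.lower()
    with_term = 0
    tgt_freq = Counter()  # segments containing each target word
    co_freq = Counter()   # segments containing both the term and the word
    for src, tgt in pairs:
        words = set(tgt.lower().split())
        tgt_freq.update(words)
        if term in src.lower():
            with_term += 1
            co_freq.update(words)
    if not co_freq:
        return None
    return max(co_freq, key=lambda w: 2 * co_freq[w] / (with_term + tgt_freq[w]))
```

With only a handful of segments the suggestion is unreliable, so the output is best treated as a candidate for a linguist to confirm rather than a finished glossary entry.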
Most terminology management systems feature term extraction functionality, but some rely on third-party extraction tools like MultiTerm Extract. The same is true of translation management systems (TMS). This means that in a regular translation workflow inside a TMS, a linguist would probably use a third-party statistical terminology extractor. However, some TMS offer built-in options for this process. You can have a look at some examples by selecting the “terminology extraction” filter on the Nimdzi TMS feature overview page.
Four examples of mainstream TMS with term extraction capabilities. Source: Nimdzi TMS feature overview tool.
Terminology management is an essential step in any successful translation project workflow, and the productivity norms used to measure it have been evolving. Earlier in 2020, we published a post about productivity in terminology management. It drew attention from academic circles, whose representatives pointed out that the productivity norm for translating terms should be lower than the one widely used within the localization industry.
Indeed, in some cases, five seconds may be unrealistic for a terminologist to decide on a term candidate, and an hour may not be enough to translate 50 terms into one target language. In other instances, though, even higher productivity rates for building terminology lists are already being achieved. For instance, Omniscien offers a solution with productivity roughly three times higher: it extracts terminology from subtitles, translates it automatically, and presents options to the user, who then votes for the best suggestion. Of course, the machine may be wrong, but, according to Omniscien, this scheme helps achieve a translation productivity rate of 180 terms per hour.
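The figures above are easier to compare when converted to time per term. The short calculation below does only that. Taking 50 terms per hour as the reference point is our assumption for the comparison, and the variable names are illustrative.

```python
def seconds_per_term(terms_per_hour):
    """Convert an hourly rate into the average time spent on one term."""
    return 3600 / terms_per_hour

baseline = 50    # terms translated per hour, the figure questioned above
omniscien = 180  # terms per hour, as reported by Omniscien

print(seconds_per_term(baseline))   # 72.0 seconds per term
print(seconds_per_term(omniscien))  # 20.0 seconds per term
print(omniscien / baseline)         # 3.6 times the baseline rate
print(3600 / 5)                     # 720.0 candidate decisions per hour at 5 seconds each
```

Seen this way, the debate is about whether a terminologist needs more than 72 seconds per translated term, while the voting workflow assumes about 20 seconds.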
Another milestone in bilingual terminology extraction was recently set by XTM. Their newly developed feature, available in XTM v12.4 and later, helps build terminology lists from existing translations with up to 90 percent accuracy.
Source: Process Innovation Challenge, Locworld
“XTM is an innovative company, more so than many other TMS providers. It invests in linguistic intelligence. Innovation is not something you can put amongst TMS requirements, but if you were to do so, then XTM would score very well.”
István Lengyel, Belazy Ltd.
For their automatic extraction of bilingual terminology, XTM uses big data, AI, and advances in computational linguistics, including an inter-language vector space. The feature already works for 50 languages, helping XTM customers save up to 80 percent of the time spent on glossary creation.
“The XTM AI team has developed a new technology to take a mundane and tedious process away from the terminologist. The bilingual term extraction performed during the alignment of the parallel source and target texts produces a spreadsheet with the data required to review and add terminology. One implication of this is that XTM users will see 80% productivity improvement over manual methods.”
Sara Basile, XTM International
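XTM has not published the details of its implementation here, so the following is only a generic illustration of the idea behind an inter-language vector space: words from both languages are represented as vectors in one shared space, and the target word closest to a source term is proposed as its equivalent. The three-dimensional vectors below are made up for the example (real embeddings have hundreds of dimensions and are learned from data), and `cosine` and `nearest_target` are hypothetical helper names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Made-up toy vectors in a shared English-Spanish space (assumption).
source_vectors = {
    "pump": [0.9, 0.1, 0.0],
    "filter": [0.1, 0.9, 0.1],
}
target_vectors = {
    "bomba": [0.8, 0.2, 0.1],
    "filtro": [0.0, 1.0, 0.2],
    "la": [0.4, 0.4, 0.4],
}

def nearest_target(term):
    """Return the target word whose vector is most similar to the source term's."""
    src = source_vectors[term]
    return max(target_vectors, key=lambda w: cosine(src, target_vectors[w]))

print(nearest_target("pump"))    # bomba
print(nearest_target("filter"))  # filtro
```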
XTM sells to both enterprises and language service providers (LSPs). This gives many different localization industry players an opportunity to try this promising automated approach, which helps tackle the challenge of aligning and extracting terminology in an efficient and innovative way.