Doing business in our digital world brings new challenges that demand new solutions. Some constants remain, however, and without them the quality of a text and of its translation would be less than satisfactory. Terminology and terminology management are a good example of such a constant.
Terminology management includes a number of different aspects, but it usually starts with terminology extraction. As we wrote in 2018, if there’s no glossary, the first task is terminology mining (also called terminology harvesting or gathering).
‘Term Extraction’ is understood as the formation of a list of terms, the translation of which should be consistent within the framework of a project. The result of extracting terminology is a list of terms with contexts listed in the glossary. Some extraction tools provide statistical solutions for gathering a list of terms for which translations do not yet exist. The translations are then created either in the course of a project, or as a separate process, by delegating this task, for example, to a terminologist.
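The statistical approach mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible method: counting how often word sequences occur and keeping those that repeat. The function name, the stop-word list, and the thresholds are hypothetical choices made for this illustration and do not describe how any particular extraction tool works.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is",
             "for", "on", "with", "by", "be", "are", "or"}


def extract_term_candidates(text, max_len=3, min_freq=2):
    """Return (candidate, frequency) pairs, most frequent first."""
    counts = Counter()
    # Work sentence by sentence so candidates never span a sentence boundary.
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z][a-z\-]*", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip sequences that start or end with a stop word.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    # Keep only candidates that occur at least min_freq times.
    return [(term, freq) for term, freq in counts.most_common() if freq >= min_freq]


sample = ("The translation memory stores segments. "
          "A translation memory speeds up work. "
          "Update the translation memory.")
for term, freq in extract_term_candidates(sample):
    print(freq, term)
```

The output is only a list of candidates. As described above, a terminologist still has to decide which candidates are real terms and supply their translations.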
To speed up the process, some tools offer extraction of bilingual terms from reference files and from previous translations. SynchroTerm (a Terminotix product, alongside LogiTerm), for example, automatically extracts terms, their equivalents, and contexts from file pairs in any format, as well as from bitexts and SDLXLIFF, XLIFF, or TMX files.
Most terminology management systems (TBS) feature term extraction functionality, but some rely on third-party extraction tools like MultiTerm Extract. The same situation is observed with translation management systems (TMS). This means that in a regular translation workflow inside a TMS, a linguist would probably use third-party statistical terminology extractors. However, there are TMS that offer built-in options for this process. You can have a look at some examples of such TMS by selecting the “terminology extraction” filter on the Nimdzi TMS feature overview page.
Four examples of mainstream TMS with term extraction capabilities. Source: Nimdzi TMS feature overview tool.
Terminology management is an essential step in any successful translation project workflow, and the productivity norms used to measure it have been evolving. Earlier in 2020, we published a post about productivity in terminology management. It garnered attention from academic circles, whose representatives pointed out that the productivity rate expected for the translation of a term should be lower than the one widely used within the localization industry.
Indeed, in some cases, five seconds may be too little for a terminologist to decide on a term candidate, and an hour may not be enough to translate 50 terms into one target language. In other instances, though, even higher productivity rates for building terminology lists are already being achieved. For instance, Omniscien offers a solution with productivity more than three times higher: it extracts terminology from subtitles, translates it automatically, and presents options to the user, who then votes for the best suggestion. Of course, the machine may be wrong, but, according to Omniscien, this scheme helps achieve a translation productivity rate of 180 terms per hour.
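To put the figures above side by side, here is a small worked calculation. It uses only the numbers quoted in this post (five seconds per term candidate, 50 terms per hour, 180 terms per hour); the function names are ours and purely illustrative.

```python
SECONDS_PER_HOUR = 3600


def seconds_per_term(rate_per_hour):
    """Time available for each term at a given hourly rate."""
    return SECONDS_PER_HOUR / rate_per_hour


def hourly_rate(seconds_each):
    """Items handled per hour if each one takes seconds_each seconds."""
    return SECONDS_PER_HOUR / seconds_each


industry_rate = 50   # terms translated per hour (figure quoted above)
voting_rate = 180    # terms per hour reported by Omniscien

print(seconds_per_term(industry_rate))  # 72.0 seconds per term
print(seconds_per_term(voting_rate))    # 20.0 seconds per term
print(voting_rate / industry_rate)      # 3.6 times the industry figure
print(hourly_rate(5))                   # 720.0 candidates judged per hour
```

In other words, the 180-terms-per-hour figure leaves about 20 seconds per term, against 72 seconds at the 50-terms-per-hour rate.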
Another milestone in bilingual terminology extraction was recently set by XTM. Their newly developed feature, available in XTM v12.4 and later, helps build terminology lists from existing translations with up to 90 percent accuracy.
Source: Process Innovation Challenge, Locworld
“XTM is an innovative company, more so than many other TMS providers. It invests in linguistic intelligence. Innovation is not something you can put amongst TMS requirements, but if you were to do so, then XTM would score very well.”
István Lengyel, Belazy Ltd.
For their automatic extraction of bilingual terminology, XTM utilizes Big Data, AI, and advances in computational linguistics, including an inter-language vector space. The feature already works for 50 languages, helping XTM customers save up to 80 percent of the time spent on glossary creation.
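The general idea behind a shared, inter-language vector space can be sketched as follows. This is not XTM's implementation, which is not described in this post. The sketch simply assumes that terms in both languages have already been mapped to vectors in one common space, and pairs each source term with the closest target term by cosine similarity. The vectors, the threshold, and all names below are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_terms(source_vecs, target_vecs, threshold=0.8):
    """Pair each source term with its most similar target term.

    Pairs whose best similarity falls below the threshold are dropped,
    which leaves them for a human terminologist to resolve.
    """
    pairs = {}
    for s_term, s_vec in source_vecs.items():
        best_term, best_score = max(
            ((t_term, cosine(s_vec, t_vec)) for t_term, t_vec in target_vecs.items()),
            key=lambda item: item[1],
        )
        if best_score >= threshold:
            pairs[s_term] = best_term
    return pairs


# Toy three-dimensional vectors, invented for this example only.
EN_VECTORS = {
    "translation memory": [0.90, 0.10, 0.00],
    "glossary": [0.10, 0.90, 0.10],
}
DE_VECTORS = {
    "Translation-Memory": [0.88, 0.15, 0.02],
    "Glossar": [0.12, 0.85, 0.15],
    "Rechnung": [0.00, 0.10, 0.95],
}

print(match_terms(EN_VECTORS, DE_VECTORS))
```

Real systems work with vectors of hundreds of dimensions learned from large corpora, and the suggested pairs would still go to a reviewer before entering a glossary.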
“The XTM AI team has developed a new technology to take a mundane and tedious process away from the terminologist. The bilingual term extraction performed during the alignment of the parallel source and target texts produces a spreadsheet with the data required to review and add terminology. One implication of this is that XTM users will see 80% productivity improvement over manual methods.”
XTM sells both to enterprises and language service providers (LSPs). This presents an opportunity for many different localization industry players to try this promising automated approach which makes smart choices and helps tackle the challenge of aligning and extracting terminology in an efficient and innovative way.