Report written by Rosemary Hynes.
Machine interpreting (MI) is a hot topic right now as technology providers tout their latest advances in the field. The advent of MI is likely to revolutionize the interpreting industry as we know it, much as machine translation (MT) upended the translation industry and ushered in a new era for all stakeholders involved. So, now is the perfect opportunity to take a deep dive into the world of machine interpreting.
First things first, what is MI? It is the transmission of a spoken message in one language into a spoken message in a different language using artificial intelligence (AI), without the input of a human interpreter. With MI, interactions between people who speak different languages can be facilitated by technology bridging the language barrier. For example, an English-speaking patient could speak to a doctor who speaks Mandarin using just a smartphone to overcome the linguistic divide. What we might deem the first attempt to build a speech-to-speech translation (S2ST) system was carried out in 2016 by a group of French researchers in a proof-of-concept paper. The team found that they were able to obtain quite promising results from their MI engine using only a small corpus. They concluded with a recommendation to train future MI engines on more varied datasets, such as TED talks or audiobooks available in the public domain.
How, then, does MI work? S2ST technology uses automatic speech recognition (ASR) to transcribe the source-language speech, then uses AI to translate it and a synthetic voice to speak the message in the target language. As it stands, current MI uses a cascade model. Like a waterfall of information flowing from one step to another, this model passes the language through a series of stages in order to reach the final product: a synthetic voice producing the speaker's message in a different language from the original.
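The cascade model described above can be sketched as three chained stages. The functions below are hypothetical stand-ins, not any real provider's API: a production system would call an actual ASR model, an MT model, and a TTS synthesizer at each stage.

```python
# A minimal sketch of a cascade speech-to-speech translation (S2ST) pipeline.
# All three stages are placeholders for illustration only.

def asr(audio: bytes) -> str:
    """Stage 1: automatic speech recognition -- source audio to source text."""
    # Placeholder: pretend the "audio" bytes are already a transcript.
    return audio.decode("utf-8")

def mt(source_text: str) -> str:
    """Stage 2: machine translation -- source text to target text."""
    # Toy word-for-word lexicon; a real system would use a trained MT model.
    lexicon = {"hello": "bonjour", "doctor": "docteur"}
    return " ".join(lexicon.get(word, word) for word in source_text.split())

def tts(target_text: str) -> bytes:
    """Stage 3: text-to-speech -- target text to synthetic target audio."""
    # Placeholder: a real system would synthesize an audio waveform here.
    return target_text.encode("utf-8")

def cascade_s2st(audio: bytes) -> bytes:
    """The 'waterfall': each stage's output flows into the next."""
    return tts(mt(asr(audio)))
```

For example, `cascade_s2st(b"hello doctor")` flows through all three stages and yields `b"bonjour docteur"`. The cascade design is modular (each stage can be swapped independently), but errors made early in the chain propagate downstream, which is one of its known weaknesses.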
Language technology providers are also scrambling to jump on the speech-to-text bandwagon, which means users can view machine-generated live subtitles (translated from the original) as well as multilingual captions (monolingual transcripts available in different languages) of speeches in their preferred language.