
THE 2024 NIMDZI 100

Technology

In this section, we highlight several key technology trends and driving forces that stood out from our analysis and predict how these will shape the market for language services in the coming years.

Generative AI is the most discussed driver of change

2023 was the first full year of GenAI, kicked off for the public by ChatGPT’s release in November 2022. Since then, a cavalcade of alternative LLM platforms – both proprietary ones from big tech companies and open-source models – has been announced, and there is no slowing down. The incredibly rapid development of, and hype around, LLMs means that all previous technology trends and predictions are effectively annulled.

Last year was all about testing to evaluate LLMs' utility, quality, and implementation strategies. The experiments can be categorized as a) augmenting, b) upgrading, or c) expanding value:

  • Augmentation integrates the new AI models to enhance existing technologies and workflows, such as machine translation quality estimation or automated post-editing.
  • Upgrading replaces prevailing technological frameworks with LLMs, such as using them for machine translation or optical character recognition.
  • Expanding value creation means employing LLMs to generate benefits that were previously unattainable or impractical with technology – such as source optimization, large-scale translation memory cleanups, multilingual content generation, or linguistic bug detection – extending the range of services LSPs can offer, their reach to new potential buyers, and the competitive landscape they need to cope with.

In addition, AI may prove to be more useful as a platform for orchestrating traditional tools in the existing and evolving language technology ecosystem.
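To make the augmentation category concrete, the sketch below shows what a prompt-based quality-estimation check can look like when wired into an existing MT workflow. It is a minimal illustration only: it assumes an OpenAI-style chat API, and the model name, prompt, and 0-100 scoring scale are illustrative placeholders rather than any vendor's documented implementation.

```python
# Minimal sketch: LLM-based MT quality estimation as an augmentation of an existing workflow.
# Assumes the OpenAI Python SDK; model name, prompt, and scoring scale are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def estimate_quality(source: str, translation: str, src_lang: str, tgt_lang: str) -> int:
    """Ask the model to score an existing MT output; low scores can be routed to human post-editing."""
    prompt = (
        f"Rate the {src_lang}->{tgt_lang} translation below for adequacy and fluency "
        f"on a scale of 0-100. Reply with the number only.\n\n"
        f"Source: {source}\nTranslation: {translation}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,        # keep the check as repeatable as possible
    )
    return int(response.choices[0].message.content.strip())

# Example routing decision inside an existing pipeline (send_to_post_editing is hypothetical):
# if estimate_quality(src, mt_output, "en", "de") < 70:
#     send_to_post_editing(segment)
```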

The prevailing sentiment from almost all stakeholders is that piloting LLMs is relatively easy and shows considerable potential. However, harvesting wide-scale benefits is more challenging than initially anticipated. The term “wide-scale” is key here: driven by demand from senior management, enterprises try to leverage GenAI across their departments, and content and language services is just one of those many functions. Experiments show that company-wide AI initiatives are no less – if not more – complex than digital transformation, which itself is not a completed journey for many organizations.

Nevertheless, the great promise of LLMs is their versatility across natural language tasks, even if their performance may still lag behind purpose-built tools or human expertise. Text summarization, rewriting, tone correction, question answering, retrieval augmented generation (RAG), poem writing, and many more functions are suddenly available in a single tool, even in multiple languages. The latest versions of platforms such as OpenAI’s GPT-4 or Google’s Gemini even offer multimodal operations with voice and image inputs, and the race between big tech and AI labs for the next big thing, video generation included, is still hot.

All that said, we haven’t seen any actual mass application of LLMs within the industry. Key factors contributing to this are the rapid pace of development and the proliferation of platforms. As with machine translation, LLMs do not represent a singular, unified technology, and there is no single clear-cut platform that comprehensively serves all use cases – and languages – in the most efficient way. Also, the largest and most capable general-purpose LLMs come with very high computational requirements, high latency, hallucination issues, brittle prompt engineering, and often unverified quality outcomes. As enterprises and LSPs have made significant investments into their language technology stacks over time, the current challenges with LLMs – however temptingly capable they look – make them a dubious option for solving old problems in a new way.
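One pragmatic way to address the “unverified quality outcomes” problem is to wrap LLM output in cheap, deterministic checks before anything reaches production. The sketch below is a generic illustration of that idea, not an industry-standard procedure; the specific checks are our own examples.

```python
import re

def passes_basic_checks(source: str, translation: str) -> bool:
    """Deterministic guardrails around LLM-generated translations: cheap checks that
    catch common failure modes before content is published or sent on for review."""
    # 1. Placeholders such as {name} or %s must survive translation untouched.
    placeholders = re.findall(r"\{[^}]+\}|%\w", source)
    if any(ph not in translation for ph in placeholders):
        return False
    # 2. Numbers in the source should reappear in the target
    #    (naive: ignores localized number formats).
    if set(re.findall(r"\d+", source)) - set(re.findall(r"\d+", translation)):
        return False
    # 3. Flag suspiciously long outputs, a typical symptom of hallucinated additions.
    if len(translation) > 3 * max(len(source), 1):
        return False
    return True
```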

In addition, the speed of advancement in the foundational models of GenAI – especially spurred by big tech – warrants some caution when making strategic, long-term commitments to any of the current LLM solutions. The prospect of domain- and task-specific, fine-tuned small language models is also on the horizon. The threat (and opportunity) of a new disruptive architecture, modality, or provider emerging within a short timeframe could render carefully considered ROI projections obsolete.

We foresee that 2024 will be about solving the complexities of operationalizing GenAI in the industry.

This entails overcoming challenges related to data governance, repeatability, scalability, integration with legacy systems, and upskilling talent for AI deployment. LSPs and buyer-side localization programs sit on a large pool of language and content data that will prove invaluable in the deeper adoption of LLMs. Consequently, 2024 is going to witness concerted efforts on both the buyer and provider side towards streamlining AI implementation and creating tangible benefits amidst these complexities.

GenAI and LLMs – impact on the buyer side

Undoubtedly, AI has already had an impact on the general demand for language services, but this is still in its early stages. Especially in the second half of 2023, many clients “paused” to think and rethink their international content strategies, with little visible outcome so far. AI is viewed as a feature enhancement rather than a job replacement, a way to gain further efficiencies in existing workflows rather than a radically new way of creating and delivering content. 

Buyers feel pressured to adopt large language models like ChatGPT for innovation signaling, though AI has not yet matched the depth of nuanced human understanding. The industry is grappling with the integration of AI into operations, impacting the traditional role of localization managers.

Nevertheless, the promise of and hype around LLMs is strong enough to pique the interest of top management on the buyer side and create pressure for implementation. For instance, generating multilingual content without intermediaries is a tantalizing opportunity to disrupt the traditional create-translate-publish cycle. Buyer-side localization managers face the challenge of pushing back on this pressure, even though the AI revolution within the language industry already happened, starting with the advent of NMT more than six years ago.

GenAI and LLMs – what providers are up to

Tech-enabled LSPs and technology providers (LTPs) are keen to demonstrate their expertise with AI to customers and the general market. This is fueled both by the fear of being left behind and by the real experience and expertise the industry already has with AI – acknowledging that NMT is in fact AI-based. As a result, many companies in the industry are experimenting with generative AI (GenAI), but tangible results are limited.

MT aggregators such as Intento and Custom.MT added OpenAI’s models to their MT portfolios. Extensive testing of the MT capabilities of LLMs is ongoing among both tech and service providers, hindered by the fact that there is no single, industry-wide accepted set of benchmarks and metrics that would settle the question. Intento, a leader in conducting these benchmarks, publishes insightful reports on the topic, even if they are limited to a small number of language pairs.
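In the absence of an agreed benchmark suite, most published comparisons fall back on standard automatic metrics computed over a shared test set. As a rough illustration only – assuming the open-source sacrebleu package and your own test data – such a comparison can be as simple as the following; automatic scores alone, of course, do not settle the quality question.

```python
# Rough illustration of comparing an NMT engine against an LLM on the same test set
# with standard automatic metrics. Assumes the open-source `sacrebleu` package;
# the sample sentences stand in for your own hypotheses and references.
import sacrebleu

references = [["Der Vertrag tritt am 1. Januar in Kraft."]]   # one reference stream
nmt_outputs = ["Der Vertrag tritt am 1. Januar in Kraft."]    # output of an NMT engine
llm_outputs = ["Der Vertrag wird am 1. Januar wirksam."]      # output of an LLM

for name, outputs in [("NMT", nmt_outputs), ("LLM", llm_outputs)]:
    bleu = sacrebleu.corpus_bleu(outputs, references)
    chrf = sacrebleu.corpus_chrf(outputs, references)
    print(f"{name}: BLEU={bleu.score:.1f}  chrF={chrf.score:.1f}")
```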

Translation management system (TMS) providers rushed to add OpenAI-based features to their product portfolios, which could potentially benefit translation memory, terminology, and quality management, but we haven’t witnessed any significant impact on client business yet. For now, the results amount to many tests and demos, but going forward, buyers expect TMS platforms to integrate LLMs as the go-to engines powering localization programs. At the same time, many technologically mature buyers run their own LLM experiments, targeting use cases such as large-scale TM cleanup, source content optimization, and cultural adaptation.
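To illustrate what a TM-cleanup experiment of this kind can look like in practice, here is a minimal sketch that asks an LLM to triage translation memory entries. It assumes an OpenAI-style chat API; the prompt, labels, and model name are hypothetical placeholders, not any vendor's documented workflow.

```python
# Minimal sketch: LLM-assisted translation memory cleanup.
# Assumes the OpenAI Python SDK; prompt, labels, and model are illustrative only.
from openai import OpenAI

client = OpenAI()

def triage_tm_entry(source: str, target: str) -> str:
    """Classify a TM segment pair as KEEP, REVIEW, or DISCARD so that human effort
    is concentrated on the entries the model flags, not on the whole memory."""
    prompt = (
        "You are auditing a translation memory. Given the source and target below, "
        "answer with exactly one word: KEEP (correct), REVIEW (possibly outdated or "
        "inconsistent terminology), or DISCARD (mistranslation or garbage).\n\n"
        f"Source: {source}\nTarget: {target}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().upper()

# tm = [("Click Save to continue.", "Klicken Sie auf Speichern, um fortzufahren."), ...]
# flagged = [pair for pair in tm if triage_tm_entry(*pair) != "KEEP"]
```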

2024 promises to be the year of the GenAI and LLM breakthrough in the TMS market as well. We anticipate that 2024 will bring more and deeper integrations, building on the many early adoptions of 2023, including:

  • Crowdin and Trados launched AI assistant copilots.
  • Lokalise and others added LLM-powered AI translation solutions to their platforms.
  • memoQ, Bureau Works, and Smartling blended GPT with translation engines, similar to Phrase and RWS Trados.
  • XTM and others have been working on enhancing TMs with LLMs.
  • LILT added a full-on LLM studio and a content creation tool, while their adaptive MT engine is LLM-driven already.
  • Smartling added LLMs from Google and Amazon Bedrock to their MT portfolio, with mixed results, citing “translation smoothing” as one of the biggest success stories for LLMs.
  • Translated, a leader in the adaptive MT space with its ModernMT, also enhanced their solution with LLMs.
  • Unbabel developed and launched TowerLLM, a translation-focused implementation of Meta’s Llama 2 model, now available in two sizes.

In addition, some LSPs have made focused efforts in the data-for-AI space. The leaders in the space are TransPerfect, RWS, Centific, Summa Linguae, e2f, and – despite their recent losses – Appen. This service line has become so important for some companies that they have created separate brands for the tech and services – namely, RWS with TrainAI and Centific with Honeybee. These companies rely on vast pools of freelancers for collecting, validating, and annotating data for AI, as well as on internal AI expertise for consulting, model hosting, and fine-tuning in the LLM space.

Based on our discussions with providers, we see two distinct trends emerging in the field of data-for-AI:

  1. Requests for service are becoming more focused on high-value expert validation instead of low-cost data gathering. Inertia in pivoting to this area might be one of the reasons for Appen's recent downfall.
  2. Alongside big tech and AI labs, demand is emerging from enterprises for expert, in-domain model retraining or fine-tuning, which entails a rising need for language-related AI data services from new potential buyers.

LSPs’ perception of generative AI in the industry

Many conversations we had with LSPs throughout the year were underscored by a sense of fear, uncertainty, and doubt, primarily driven by the hype-fueled perception that GenAI is a real revolution for the industry. The 2024 Nimdzi 100 survey results reveal a completely different picture: the actual impact of generative AI was slightly positive last year. Only one out of five respondents experienced a negative effect, while equal numbers of LSPs cited neutral and positive impact. The outlook is even more optimistic: most of the 16-percentage-point decrease in neutral responses shifted to expectations of positive impact this year, which means:

More than half of LSPs expect that generative AI will be beneficial to their business in 2024. 

How has generative AI impacted your business, and how do you expect generative AI to impact your business in 2024?

These expectations are not based on passive acceptance. While 42% of respondents reported that they are still cautiously waiting to see how the GenAI story unfolds, more than half of the LSPs in our survey have already taken action. Training internal staff in the use of these new tools tops the list of actions (61% of respondents), and adapting internal processes follows closely at 58%. 57% of respondents actively talk to their clients about implementing generative AI, and 52% reported that they have launched new related solutions and services. Optimism also shows in staffing decisions: 20% of responding LSPs hired new staff because of the new technologies, while only 7% had to reduce headcount to mitigate risks.

What actions have you taken as a result of the recent advancements in generative AI?

When it comes to the actual use of LLMs, the vast majority (67%) of respondents prefer out-of-the-box solutions, just over half (51%) integrate LLMs into their workflows via APIs, and 44% rely on their technology providers for access to the new technologies. As the technology is still in its infancy and developing rapidly, we look forward to seeing how these usage scenarios evolve.

How are you using generative AI?

While machine translation engines have become a reliable staple in the LSP toolbox and almost all conversations revolve around the new generative AI, progress is also being made in other areas of language technology. Multimedia and audiovisual technologies have advanced considerably, even though they are yet to reach full-scale production level. Last year, we wrote about live subtitling, and we can say it has since entered wider adoption. Machine interpreting (MI) is on the rise, with new solutions on the horizon. AI dubbing remains a highly active topic, and further advancements are expected in automation and orchestration.

Machine interpreting – on the rise

Last year, we prophesied that AI-driven machine interpreting (MI) would gain momentum, and indeed, it did. It is a hot topic in the industry, spurring spirited conversations, viral videos, and the SAFE AI Task Force. There is no doubt that the technology is picking up pace, and though implementations are not widely publicized, growth is expected to be strong. Nimdzi’s research indicates that MI keeps finding and opening new markets rather than directly competing with human interpreting – at least for the time being.

The well-known solutions for MI (also called speech-to-speech translation, S2ST) include both devices (such as Timekettle’s and Waverly Labs’ products or even Google Pixel Buds) and software applications. Software applications have the benefit that they can be used within video conferencing interfaces, i.e., users can pull them into a Meet, Teams, or Zoom meeting. Our MI Evolution Matrix considers Kudo, Wordly, and Interprefy the leaders in this space. They primarily target B2B online conferences and events with their solutions.

Demand from large enterprises for these technologies has been increasing since COVID times, as they provide a viable alternative for multinational companies that want to respect the multilingualism of their staff. For instance, MI allows a single trainer to host one training session for a large audience of speakers of diverse languages instead of organizing individual sessions with different trainers for the various languages.

That said, experts in the field generally agree that MI is not yet accepted for deployment in more critical and regulated market segments, such as healthcare, legal, or government interpreting, with customers often citing accuracy and liability challenges as the reasons. However, given recent developments, barriers to entry into these verticals may be more perceptual than technological, and we will likely witness sector-specific implementations of MI in 2024.

Hospitals and courts will likely become a new testing ground for MI, and operational delivery will probably come from an amalgamation of funds and technical expertise. Laggards will fade quickly into obsolescence, and players with enough capital to play the long game may eventually merge or be acquired by the bigger platforms. Although the giants may dominate the monolingual meeting space, the multilingual interpretation know-how they lack may be easier to buy than to build.

In the meantime, big tech’s AI labs are moving forward with new foundational architectures. While “traditional” MI involves sequential transcription, machine translation, and text-to-speech for voice output, direct S2ST models – which sidestep the whole cascade and translate directly from speech to speech – are also on the horizon. The prime examples of these efforts are Meta’s models under the Seamless umbrella name (which aim to create technology to power a language-agnostic future metaverse), and Google’s AudioPaLM (an interesting multimodal solution that fuses PaLM 2 and AudioLM) and Translatotron (which reached its third version in December 2023). These multilingual models are primarily for research purposes and showcasing R&D prowess, and they are not available for commercial use. Nevertheless, they indicate a future direction for machine interpreting technologies.
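The difference between the two architectures is easiest to see in code. The sketch below contrasts the “traditional” cascade with direct S2ST; all components are hypothetical stubs standing in for whatever ASR, MT, TTS, or speech-to-speech models a given platform uses, not real vendor APIs.

```python
# Sketch of the two machine-interpreting architectures discussed above.
# All components are hypothetical stand-ins (simple stubs), not real vendor APIs.

def speech_to_text(audio: bytes, lang: str) -> str:
    return "stub transcript"       # placeholder for an ASR service

def machine_translate(text: str, src: str, tgt: str) -> str:
    return "stub translation"      # placeholder for an MT engine

def text_to_speech(text: str, lang: str) -> bytes:
    return b"stub audio"           # placeholder for a TTS service

def cascaded_interpreting(audio_in: bytes, src: str, tgt: str) -> bytes:
    """'Traditional' MI: sequential ASR -> text MT -> TTS.
    Errors and latency accumulate at each stage of the cascade."""
    transcript = speech_to_text(audio_in, src)
    translation = machine_translate(transcript, src, tgt)
    return text_to_speech(translation, tgt)

def direct_s2st(audio_in: bytes, src: str, tgt: str) -> bytes:
    """Direct S2ST: a single model maps source speech to target speech, sidestepping
    the cascade (the approach behind Seamless, AudioPaLM, and Translatotron)."""
    return b"stub audio"           # placeholder for a single speech-to-speech model
```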

AI dubbing

Dubbing is one of the bread-and-butter services in the media localization industry and, to date, is (almost) exclusively performed by voice actors. However, the latest developments in AI dubbing are changing the landscape. Synthetic voice technology has improved to the point where it can be indistinguishable from a human voice. Voice cloning – where a speaker's voice, including intonation, rhythm, and mannerisms, is replicated from an audio sample of just a few seconds – enables deceptively realistic AI dubbing. Spotify’s use of AI dubbing for select podcasters is one of the many widely known applications of the technology. In addition, lip-syncing in dubbed videos has become a possibility. Although the technology is not enterprise-grade yet and latency remains a big question, live videos of lip-synced, voice-cloned, AI-dubbed events, such as those from HeyGen, demonstrate the future.

Companies like Synthesia, Voiseed, Translated, and Verbalate are at the forefront of AI dubbing, along with a range of new kids on the block with the word “dub” in the product or company name. The buzz is well-merited, as demand has been growing for technology and services alike. Major media companies are all working with their technology and localization partners to implement AI dubbing in production, bearing in mind the legal and intellectual property challenges of voice cloning, which are not dissimilar to those that spurred the recent Hollywood strikes. At the time of publication, the trade union SAG-AFTRA is voting on updates to its Television Animation Agreements to include an acknowledgment that “voice actors” are human and a confirmation that voice-cloned actors “shall be eligible for residuals” from foreign-language distribution.

Orchestration

The language industry is fragmented not only on the service provider side but also in the number of technologies used (see our Language Technology Atlas), especially as language tech has to connect to clients’ various content, document, and website management systems. Integrators and connectors are key ecosystem components for achieving efficient multilingual content pipelines, ensuring the flow of content from creation through localization to publishing. However, with the proliferation of platforms across this ecosystem (CMS, TMS, MT, and other solutions playing their part), these pipelines are ever-branching, looping, and reconverging over an ideally centralized set of linguistic assets – style guides, glossaries, translation memories, etc. – depending on the content type, source, workflow, and destination. The automations that route content into workflows are often scripts, macros, and code snippets created by localization engineers.

Recently, no-code and low-code solutions tailored to the language industry have been popping up to solve these workflow and routing problems, promising to reduce manual and coding effort. The most prevalent players in this orchestration field are Blackbird and Phrase. Blackbird, a content integration platform for the language industry, offers Bird Editor, a visual no-code workflow editor for a wide range of business-logic use cases, such as MT engine routing, trigger-based workflow selection, or automated communication across the tools users want to integrate. Phrase, primarily a TMS provider, created the Orchestrator solution, which caters to its customers within the Phrase Localization Suite.
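What these visual editors let users assemble ultimately boils down to routing rules of the kind sketched below. The example is a generic, hypothetical illustration of trigger-based MT engine routing; the engine names and thresholds are placeholders, not how Blackbird's Bird Editor or Phrase Orchestrator is actually implemented.

```python
# Hypothetical illustration of the routing logic a no-code workflow editor encodes:
# pick an MT engine (or a human workflow) based on content type and target language.
from dataclasses import dataclass

@dataclass
class ContentJob:
    content_type: str   # e.g. "ui_string", "marketing", "legal"
    source_lang: str
    target_lang: str
    word_count: int

def route(job: ContentJob) -> str:
    """Return the workflow/engine this job should be sent to.
    Engine names are placeholders; a real setup would map to configured connectors."""
    if job.content_type == "legal":
        return "human_translation"             # regulated content stays with humans
    if job.content_type == "marketing" and job.word_count < 500:
        return "llm_with_brand_glossary"       # short-form creative content
    if job.target_lang in {"ja", "ko", "zh"}:
        return "custom_nmt_asia"               # engine tuned for these locales
    return "generic_nmt"                       # default machine translation engine

job = ContentJob("marketing", "en", "de", 320)
print(route(job))   # -> "llm_with_brand_glossary"
```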

While automation and integrations have always been at the heart of every efficient localization program, they have typically required custom development. The new no-code solutions cater both to customers looking for ease of use in an out-of-the-box solution and, through code injection, to those who prefer to closely inspect and configure their automations. We’re excited to see how these and other no-code platforms develop in an age where GenAI promises to democratize coding via copilots and natural language interfaces.



