USD 500 million in data services and machine intelligence work. This is what Appen, Lionbridge, Pactera, Welocalize, Alibaba LS, and other techy language service providers (LSPs) will likely generate in revenue this year. That’s close to 7 percent of revenue generated by the global Top-100 LSPs, and it’s on par with the volume they derive from Legal, which is a classic translation vertical. Furthermore, Legal does not promise explosive growth, while experts believe AI training will triple in five years.
Can you feel the envy of large multi-language vendors (MLVs) who don’t have data services business units? The gnawing fear of missing out is real and is further exacerbated when they see their competitors rebrand and change their domains from old-fashioned .com to the sexy and novel .ai, or to enigmatic and vague .io.
As the work trickles from MLVs down the supply chain, smaller LSPs scratch their heads when they receive queries for dialogue enrichment, annotation, tagging, article categorization, and sound recording.
The question on everyone’s mind is: should I be doing this, too? Should I restructure my well-optimized LSP and chase the spectre of AI? Data services are a form of outsourcing, and outsourcing is something that LSPs are good at. In five years, there is no guarantee that non-AI MLVs will even remain relevant, even if today AI is just a catchphrase.
Translation is driven by regulated industries where buyers have no other choice but to translate. Boeing has to provide multilingual technical documentation to export its planes. Pfizer needs to run clinical trials in 30+ local languages to bring drugs to European markets. In order to build a power plant, Rosatom files millions of words of technical documentation with the local authority first. Sony lawyers file for patent protection in multiple jurisdictions all the time. Even if translation is automated on the vendor side, buyers don’t have the option of following the Ikea example of getting rid of text. Translations will definitely be there for the next decade.
It’s a different story with AI projects. It takes a lot of human work to train a model, but the maintenance part might not generate as much revenue for the LSP. A project that brings in USD 250,000 in the first year might bring in only USD 50,000 in the second. Furthermore, a buyer who has already trained a model might not need a second one right away. A successful automation reduces the cost tenfold compared to human work, and a failed automation might discourage the client from investing again in the same year. Only IT giants like Apple, Facebook, Google, and Amazon have the capacity for constant experimentation regardless of cost, and this is exactly the habitat where Appen has grown so fast in so little time.
LSPs have mastered outsourcing, but they are not good at AI leadership. More often than not, language services companies lack the salespeople required to imagine a use case for the buyer and to get the buyer’s eyes wide open with excitement. LSPs also often lack technical directors who can transform a half-baked AI idea into a manageable workflow with specific tasks, algorithms, and models. Outsourcing human labor at scale is the juice of data services, but it’s also just one step in a complex AI journey. That’s why LSPs are in the background and not leading the charge.
Specialist AI companies with industry-specific and well-defined use cases are much better at this. Would you rather buy “data services” or “training models to diagnose bone fractures from X-ray images”? “Machine intelligence” or “a social media scanner that looks for financial news that affects stock prices”? The value is in their specificity.
Data services is not a single well-defined activity, but a multitude of tasks. The industry consists of tagging, annotation, rewriting dialogue, subtitling, categorizing text, sorting reviews into positive and negative, and more. Since LSPs crowdsource these tasks to dozens of data workers, managing this at scale via existing workflows in Plunet and XTRF becomes challenging.
Instead, Welocalize, Lionbridge, and Pactera have built custom systems fit for this purpose. Amplexor is building one too, alongside another ten or so MLVs. Smaller players will not be able to support the development effort themselves, especially not without data services bringing in millions to sustain the hungry engineers and their pet server farms.
Mid-sized specialty LSPs with scalability and customer service that have connected with major customers have enjoyed the fastest growth in recent years. As companies add niches and services, and as their offering and client pools increase, they become more difficult to manage and market.
Adding these new and exciting service lines definitely creates the ‘wow effect’. Still, they are unlikely to produce consistent repeat business without giant clients in the IT sector to experiment with AI again and again and invest in data services on a repeat basis. So, this is not a business for everyone.
Only for the select few.