Artificial Content and the “Digital Shield”

Report written by Yulia Akhulkova.

2020 was a big year for language technology. It was a year when family, friends, and even neighbors were following the latest language tech trends, particularly the continuous rise of AI. We saw new AI models being released as well as significant projects centered around content creation and summarization. While some embrace these developments, others have remained skeptical. One lesser-known application for AI, dubbed the “digital shield,” is also set to become a more prominent part of the fight against misleading and manipulative content.

The latest and greatest

2020 saw the release of:

  • Turing NLG: A language model with 17 billion parameters, introduced by Microsoft in February 2020. To put this into perspective, the human brain has roughly 86 billion neurons, although neurons are only a loose analogy for model parameters.
  • OpenAI’s GPT-3: One of the largest language models out there, it uses deep learning to produce human-like text and has 175 billion parameters.
  • mT5: A multilingual model covering over 100 languages, open-sourced by Google in October 2020. Models of this kind don’t come cheap: training a T5 model reportedly costs around USD 1.3 million in cloud computing alone!
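To make the scale of these figures concrete, here is a minimal Python sketch that compares the two parameter counts quoted in the list. The dictionary and the function name are illustrative choices, not part of any vendor tooling.

```python
# Parameter counts as quoted in the list above.
PARAMS = {
    "Turing NLG": 17_000_000_000,
    "GPT-3": 175_000_000_000,
}


def size_ratio(larger: str, smaller: str) -> float:
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMS[larger] / PARAMS[smaller]


ratio = size_ratio("GPT-3", "Turing NLG")
print(f"GPT-3 has about {ratio:.1f}x the parameters of Turing NLG")
```

In other words, the gap between two models released only months apart is roughly an order of magnitude.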

2021 is set to be just as exciting a year for the field. In fact, Google has already open-sourced a trillion-parameter model, so this space is one to watch.

Too little and too much content

Whether content is sparse or overabundant, AI models like those listed above can help. Feed them some sample text and they will automatically generate content that matches the original style, tone, and intent of the writer. They can create basic analogies, write recipes from scratch, complete basic code, and the list goes on and on. 

One well-known example is OthersideAI, which uses GPT-3 to turn bullet points into full, personalized emails, as well as to generate documents that can be processed, stored, and indexed automatically. 

Going from one extreme to the other, even when there is too much content, AI models can analyze the text and select the most important points. Some technology providers would argue that there is already too much data, yet too little time to process it. This is one reason why 2020 saw several new AI projects focused on speeding up how content is consumed through text summarization.
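The idea of “selecting the most important points” can be illustrated with a very simple extractive summarizer. The Python sketch below is a word-frequency heuristic, not the neural approach used by the models and projects mentioned in this article. The stopword list, function names, and scoring rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; real systems use larger ones).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are", "it",
    "that", "this", "for", "on", "with", "as", "be", "by", "at", "was", "were",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text: str) -> list[str]:
    """Naively split on '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def summarize(text: str, max_sentences: int = 2) -> list[str]:
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence: str) -> float:
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return [sentences[i] for i in keep]


sample = (
    "Summarization saves time. Cats sleep a lot. "
    "Summarization tools pick key sentences to save time."
)
for line in summarize(sample, max_sentences=2):
    print(line)
```

A heuristic like this only copies existing sentences. Producing fluent bullet points or narration, as the projects described here aim to do, requires generative models.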

While text summarization startups have had limited success in creating coherent summaries, larger companies are now getting in on the act, and the results may turn out to be considerably more promising. In December 2020, Facebook announced a project codenamed “TLDR” (too long, didn’t read) with the goal of reducing articles to bullet points and providing narration. It also features a virtual assistant to answer questions.

Such technology is also being used to automatically summarize comments on social media for the purpose of gisting and obtaining commenter insights. It also has applications for speech recognition. HENSOLDT Analytics is using AI to extract and summarize information and analyze content. Speech recognition and text processing are now seen as ways to improve information retrieval and analysis that may be needed at a later stage in the game.

So, what content can be trusted?

Whether written by a human or by a machine, content can be manipulative. In fact, it can be crafted in such a way that it’s tantamount to brainwashing. The influence of conspiracy theories is a prime example of this. Fortunately, there are projects aimed at tackling this very issue. In February 2021, Nimdzi had the chance to test out leegle.me, an AI-driven project that aims to detect brainwashing in both speech and written text.

[Figure: AI language models: text checking. Source: Leegle]

The idea of creating such a “digital shield” came to one of the founders after his girlfriend was scammed out of USD 10,000 worth of family heirlooms through “verbal hypnosis.” Together with a friend, who’s a psychologist and linguist, he set out to develop software capable of detecting verbal hypnosis in any text. And, since brainwashing can take various forms, they added one more analytical tool: the emotional detector, a proprietary invention based on the team’s practical expertise in psychology and advertising.

These two features combine to provide a quick and easy way to detect brainwashing or manipulation in any and all types of text. Leegle has been a long time in the making: the analysis of neuro-linguistic structures took about 12 years, and the software development took another five. Currently, the platform supports 63 languages.

To use the online platform, you simply need to:

  1. Upload a text (the sample must be between 300 and 5,000 characters)
  2. Let the tool process the text
  3. Look at the visualized results, with charts and reports based on D3 (Data-Driven Documents)
  4. Read the conclusion, which helps you understand how manipulative the text is
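Before uploading, it can be handy to check whether a sample fits the length limits from step 1. The Python sketch below is a hypothetical local helper, not part of Leegle. It counts characters with Python’s `len`, which includes spaces and punctuation, and whether the platform counts characters the same way is an assumption.

```python
MIN_CHARS = 300   # lower bound quoted in step 1
MAX_CHARS = 5000  # upper bound quoted in step 1


def check_sample_length(text: str) -> tuple[bool, str]:
    """Return (fits, message) for a text sample checked against the limits.

    Hypothetical helper: the platform's own counting rules are not documented
    here, so treat a result close to either limit with caution.
    """
    n = len(text)
    if n < MIN_CHARS:
        return False, f"too short: {n} characters, need at least {MIN_CHARS}"
    if n > MAX_CHARS:
        return False, f"too long: {n} characters, limit is {MAX_CHARS}"
    return True, f"ok: {n} characters"


fits, message = check_sample_length("A short note.")
print(fits, message)
```

Longer documents could be checked section by section with the same function, although how best to split a text for analysis is not covered by the source.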

Leegle detects subconscious influence and emotional influence, and performs a category factor analysis. It also highlights the most influential words.

[Figure: AI language models: analysis results. Source: Leegle; results from an analysis of Nimdzi content]

Technology-wise, Leegle runs on:

  • IBM Watson Natural Language Understanding API
  • Google Translation API 
  • Google Natural Language API 
  • Leegle’s very own algorithm designed to catch influence through verbal hypnosis and emotional manipulations in texts

What’s particularly interesting about this tool is that it can be used not only by content consumers to check whether they are being influenced, but also by content creators to evaluate their presentations and publications and adjust them accordingly.

Of equal note, with AI-generated content becoming more commonplace (which some fear could exacerbate the “fake news” problem), this technology could be used to help keep AI content in check.

Is Nimdzi content trustworthy?

We couldn’t help but try out the beta version of Leegle on our own Nimdzi content. Here’s the result:

“Conclusion: Great! The text does not contain hidden artificial suggestions. Emotionally, the text is neutral. It gives a solid representation of information.”

So, the resounding answer is yes: Nimdzi content is trustworthy. This is great for you, our readers. But a tool that allows readers to understand the emotional influence of content may pose challenges to sales and marketing.

More “Leegalized” content!

Technologies like Leegle can help us, our families, friends, and neighbors stay vigilant about the content we consume and the sources of this information. As we see more and more misleading data and artificial content coming out every day, isn’t it about time that we, as readers, were able to learn just how trustworthy the content we consume really is?

Yulia Akhulkova Data Scientist Nimdzi Insights

This article was researched and written by Nimdzi's VP of Technology, Yulia Akhulkova. If you wish to learn more, reach out to Yulia at [email protected].

14 April 2021
