It’s been two years since OpenAI announced the advent of GPT-3, the groundbreaking natural language processing (NLP) application. Users were blown away by the AI language tool’s uncanny ability to generate insightful, thoughtful and colorful prose — including poems, essays, lyrics and even detailed manifestos — from the briefest of prompts.
OpenAI’s GPT-3 is what’s known as a “foundation model,” trained on practically the entire internet, from Wikipedia to Reddit to the New York Times and everything in between. It uses this extensive dataset to predict which words are most plausible in response to any given prompt. Given the scale of such a research undertaking, only a handful of these foundation models exist. Others include Meta’s RoBERTa and Google’s BERT, plus models developed by startups such as AI21.
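As a toy illustration of this “predict the most plausible next word” idea (this is not OpenAI’s actual implementation, and the corpus below is a made-up stand-in for training data), a simple bigram counter captures the core mechanic:

```python
from collections import Counter, defaultdict

# Tiny hypothetical corpus standing in for the model's training data.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count bigrams: for each word, tally the words that follow it.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def most_plausible_next(word):
    """Return the continuation seen most often after `word`."""
    return following[word].most_common(1)[0][0]
```

A real foundation model conditions on far more context than one preceding word and learns from billions of documents, but the objective is the same: rank possible continuations by plausibility.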
Nearly all commercial AI text-generation applications, from blogging tools to headline generators, are built on top of one of these foundation models via an API. The foundation models are the tracks, and the commercial applications are the trains that run along them. And traffic on these tracks is starting to grow, fueled by recent VC investments: an $11 million Series A round for AI copywriting tool Copy.ai, $10 million in seed funding for AI content generator Copysmith, and a $21 million Series A round for AI writing assistant writer.com.
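The track-and-train structure above can be sketched in a few lines. This is a hypothetical pattern, not any particular vendor’s code: `foundation_model_complete` stands in for a real provider API call, and the narrow commercial application is just a task-specific prompt template wrapped around it.

```python
def foundation_model_complete(prompt: str) -> str:
    """Stand-in for a call to a foundation model's completion API
    (in practice, an HTTP request to a provider). Returns canned text."""
    return f"[generated continuation of: {prompt!r}]"

def write_ad_headline(product: str, audience: str) -> str:
    """A 'train on the tracks': a commercial app that wraps the
    general-purpose model in a task-specific prompt template."""
    prompt = (
        f"Write a short, punchy ad headline for {product}, "
        f"aimed at {audience}:"
    )
    return foundation_model_complete(prompt)

headline = write_ad_headline("a meal-kit service", "busy parents")
```

Most of the differentiation between such products lives in the prompt templates, guardrails and workflow around the model, not in the model itself.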
With AI text generators now widespread in the marketing and communications industry, some of the day-to-day content that marketing and communications teams produce is now generated by AI, including ad copy, social media captions and blog posts.
While much of this is quite prosaic and raises little concern – if AI can write search ad titles that increase clicks, so be it – legitimate concerns about the effects of this technology in other uses need to be acknowledged and addressed by those of us who build these platforms.
Scale alone does not equal success
The heart of the matter for many is the enormous opportunity for scale that this technology offers, and what the consequences of that scale might be. Here, it’s helpful to distinguish between short-form and long-form content. The negative impact of organizations scaling up short-form content creation, such as ad copy or landing page copy, is negligible: businesses can reduce costs and improve conversions with few drawbacks. It is in longer-form content that problems arise.
At the lower end of the long-form content food chain, such as travel and lifestyle blogs, there seems to be little harm in using natural language processing (NLP) to sift through huge datasets to generate blogs for SEO. After all, is there a difference between NLP using its dataset to generate a 500-word blog post on the “5 best things to do in Denver” and a human writer researching (and lightly plagiarizing) the first few results on Google? Not much, most likely.
But does the world need more “meh” content written by someone (or something) with no actual expert knowledge of the subject? And who benefits from this other than the platform being paid to generate it? Admittedly, we’ve had this problem before, with content mills producing articles from writers without any knowledge of what they’re writing about. But the cost and time constraints of human writers have at least partially kept this under control. Removing these restrictions could open the floodgates.
Google addresses AI text generation
Google is clearly preparing for this, as evidenced by some of its recent search updates. It recently announced the rollout of its “Helpful Content” algorithm update, which will, among other things, devalue search results with “low added value.” Meanwhile, Google has also gradually increased the importance of what it calls E-A-T — Expertise, Authoritativeness and Trustworthiness — in evaluating content. In short, Google focuses more on the author’s demonstrable expertise and on the authority and trustworthiness of the website.
In other words, Google places greater emphasis on content written by proven subject matter experts, providing unique insights, findings or other added value, published on websites with demonstrably robust editorial policies. Since NLP tools generate content based on what has already been written on a topic, they will struggle to provide something unique. So while it’s never been easier to generate a blog post on literally any topic, the bar has never been higher for getting a blog onto the first page of Google.
Marketers need to understand this new paradigm and use AI text-generation tools where they add value while avoiding them where they don’t — using them to generate optimized ad and sales copy or tables of contents, for example, rather than relying on them to generate long-form technical blog content.
Automation is not a substitute for expertise
Another area of communication where we’re seeing rapid growth in AI text-generation tools is PR, with a number of NLP-powered media pitching platforms now on the market. These range from light personalization based on the LinkedIn profile of the intended recipient to writing full pitches on behalf of the company.
Still, I speak from direct experience when I say it is vital that platform developers and users alike understand the problem these platforms are trying to solve, and fully appreciate the parts of the process where human expertise cannot be replaced. Essentially, the role of these platforms is to remove friction between an organization (or a person) and the media, so that users can send pitches to journalists more quickly and accurately — in other words, to act as a door opener.
Users still need to be subject matter experts on the topics they pitch to journalists, and they need to provide valuable insights into their pitches, rather than just spam reporters. Platform developers have some control over the latter by setting sensible limits on how often a reporter can be contacted, while the former is the responsibility of the users.
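One of those “sensible limits” could be as simple as a per-journalist contact cap. The sketch below is hypothetical — the class name, window length and cap are all invented for illustration, not drawn from any real pitching platform:

```python
from collections import defaultdict
import time

class PitchRateLimiter:
    """Allow at most `max_pitches` pitches to the same journalist
    within a rolling window of `window_seconds`."""

    def __init__(self, max_pitches=1, window_seconds=7 * 24 * 3600):
        self.max_pitches = max_pitches
        self.window = window_seconds
        self.sent = defaultdict(list)  # journalist -> send timestamps

    def allow(self, journalist, now=None):
        now = time.time() if now is None else now
        # Drop sends that have aged out of the rolling window.
        recent = [t for t in self.sent[journalist] if now - t < self.window]
        self.sent[journalist] = recent
        if len(recent) >= self.max_pitches:
            return False  # cap reached: block this pitch
        self.sent[journalist].append(now)
        return True
```

The design choice here is a rolling window rather than a fixed calendar cap, so a burst of pitches can never straddle a reset boundary and hit the same reporter twice in quick succession.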
And platform developers should educate users about the responsible use of these platforms. Connecting a journalist with a user has only negative value if that user is not qualified to speak on the subject in question. Such users are quickly found out and damage their personal and professional reputations in media circles.
A case for the human editor
If language models ever attained true consciousness, one of their defining personality traits would be that of a pathological liar. Anyone who experiments with these models will quickly realize that they make things up as they go, generating the most plausible-sounding response to a given prompt.
It is therefore not difficult to imagine the problems that could arise if AI-generated text goes unchecked in many use cases. Content that provides inaccurate financial or health advice, for example, can lead to serious harm.
This is why the role of the human editor will become more important as an increasing amount of content is generated by AI. Editors will be crucial within both media outlets and mainstream organizations, and their role will have to evolve to focus more on fact-checking and verification. Much of an editor’s work will also involve training AI models in the desired tone of voice, technical level and narrative style, in much the same way they train human writers.
Ultimately, we are only at the beginning of the adoption curve for AI text-generation technology. It is almost inevitable that its use will become mainstream over the course of this decade, given the enormous scale and efficiency gains the technology offers. What we can expect in response is even greater value being placed on real subject matter experts who deliver original, unique content. And those who use AI text-generation tools in the right use cases — to remove friction and to scale — will benefit the most.