On Sept. 16, Google updated the description of its helpful content system. The system is designed to help website administrators create content that will perform well on Google’s search engine.
Google doesn’t disclose all the methods it uses to “rank” sites, as these are at the heart of its business model and are precious intellectual property, but it does provide tips on what a site should and shouldn’t contain.
Until Sept. 16, one of the factors Google focused on was who wrote the content. It gave greater weighting to sites it believed were written by real humans, in an effort to elevate higher-quality, human-written content above that which was most likely written using an artificial intelligence (AI) tool such as ChatGPT.
It emphasized this point in its description of the helpful content system: “Google Search’s helpful content system generates a signal used by our automated ranking systems to better ensure people see original, helpful content written by people, for people, in search results.”
However, in the latest version, eagle-eyed readers spotted a subtle change:
“Google Search’s helpful content system generates a signal used by our automated ranking systems to better ensure people see original, helpful content created for people in search results.”
It seems content written by people is no longer a concern for Google, and this was then confirmed by a Google spokesperson, who told Gizmodo: “This edit was a small change […] to better align it with our guidance on AI-generated content on Search. Search is most concerned with the quality of content we rank vs. how it was produced. If content is produced solely for ranking purposes (whether via humans or automation), that would violate our spam policies, and we’d address it on Search as we’ve successfully done with mass-produced content for years.”
This, of course, raises several interesting questions: how is Google defining quality? And how will the reader know the difference between a human-generated article and one by a machine, and will they care?
Mike Bainbridge, whose project Don’t Believe The Truth looks into the issue of verifiability and legitimacy on the web, told Cointelegraph:
“This policy change is staggering, to be frank. To wash their hands of something so fundamental is breathtaking. It opens the floodgates to a wave of unchecked, unsourced information sweeping through the internet.”
The truth vs. AI
As far as quality goes, a few minutes of research online shows what sort of guidelines Google uses to define quality. Factors include article length, the number of included images and sub-headings, spelling, grammar, etc.
It also delves deeper and looks at how much content a site produces and how frequently to get an idea of how “serious” the website is. And that works pretty well. Of course, what it is not doing is actually reading what is written on the page and assessing that for style, structure and accuracy.
When ChatGPT broke onto the scene close to a year ago, the talk was centered around its ability to create beautiful and, above all, convincing text with virtually no facts.
Earlier in 2023, a law firm in the United States was fined for filing a lawsuit containing references to cases and legislation that simply do not exist. A keen lawyer had merely asked ChatGPT to create a strongly worded filing about the case, and it did, citing precedents and events that it conjured up out of thin air. Such is the power of the AI software that, to the untrained eye, the texts it produces seem entirely genuine.
So what can a reader do to know that a human wrote the information they have found or the article they are reading, and whether it’s even accurate? Tools are available for checking such things, but how they work and how accurate they are remain shrouded in mystery. Furthermore, the average web user is unlikely to verify everything they read online.
Until now, there has been almost blind faith that what appeared on the screen was real, like text in a book, and that someone somewhere was fact-checking all the content, ensuring its legitimacy. And even if it wasn’t widely known, Google was doing that for society, too, but not anymore.
In that vein, blind faith already existed that Google was good enough at detecting what is real and what is not and filtering it accordingly, but who can say how good it is at doing that? A large quantity of the content being consumed may already be AI-generated.
Given AI’s constant improvements, it is likely that the quantity is going to increase, potentially blurring the lines and making it nearly impossible to differentiate one from another.
Bainbridge added: “The trajectory the internet is on is a perilous one — a free-for-all where the keyboard will really become mightier than the sword. Head up to the attic and dust off the encyclopedias; they are going to come in handy!”
Google did not respond to Cointelegraph’s request for comment by publication.