Are We Heading for a Tidal Wave of AI Slop?

Do you feel that the quality of AI content has taken a turn for the worse in recent months? I certainly do.

AI-generated content is eating itself and the results are not pretty.

The feedback loop nobody wanted

To understand this issue, you have to reflect on the evolution of large language models (LLMs).

When LLMs first burst into mainstream use, they were trained on vast swathes of human-written text: journalism, academic papers, forum discussions, blog posts, technical documentation, etc.

This was generally good quality content. Decades (centuries?) of accumulated human knowledge and expertise, painstakingly researched and written by people who actually knew what they were talking about.

Fast forward to today, and an increasingly significant proportion of the content being published online (articles, blog posts, social media copy, product descriptions, FAQs, etc.) is being generated by AI.

The fundamental issue with this is that the next generation of LLMs is trained on a pool of content that is inexorably being diluted with AI-written material. That is not a subjective or hypothetical problem: it is happening right now.

Shit in, shit out

“Garbage in, garbage out” (GIGO) is a well-known computing principle: the quality of an output is determined by the quality of the input. I have always embraced this philosophy, but prefer the more colourful version: shit in, shit out.

I generally find that it applies to pretty much everything in life.

The logic is brutally simple. If you feed a model low-quality or inaccurate content during training, that model will produce low-quality or fabricated content in return. If that output is then published and fed back into the next model’s training data, the problem compounds. Each generation of AI learns partly from the mistakes of the last, then confidently reproduces and elaborates upon them.

It is a downward spiral.
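
To make the compounding concrete, here is a rough toy simulation in Python. It is a sketch of the argument, not a model of how any real LLM is trained, and every number in it is an assumption I have picked purely for illustration: the share of human-written content that is wrong (human_error), how quickly AI-written material grows as a share of the training pool (share_step, max_share) and how much a model exaggerates the errors it learns from (amplification).

```python
def simulate_error_rates(generations, human_error=0.02, share_step=0.2,
                         max_share=0.9, amplification=1.2):
    """Toy model of error rates across successive model generations.

    All parameters are illustrative assumptions, not measurements.
    Generation 0 is trained on human-written content only.
    """
    # The first model only sees human content, but still amplifies its errors.
    rates = [min(1.0, amplification * human_error)]
    for gen in range(1, generations + 1):
        # Assumed: the AI-written share of the pool grows each generation.
        ai_share = min(max_share, gen * share_step)
        # The pool mixes human content with the previous model's output.
        pool_error = (1 - ai_share) * human_error + ai_share * rates[-1]
        # The new model reproduces and elaborates on the errors in its pool.
        rates.append(min(1.0, amplification * pool_error))
    return rates


if __name__ == "__main__":
    for gen, rate in enumerate(simulate_error_rates(8)):
        print(f"generation {gen}: error rate {rate:.4f}")
```

Under these assumptions the error rate never falls from one generation to the next, and it climbs faster as the AI-written share of the pool grows. Change the numbers and the speed changes, but as long as models amplify what they learn and the pool keeps filling with their output, the direction does not.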

The accuracy problem is getting worse, fast

Anyone who works with AI-generated content regularly will have noticed that factual accuracy is becoming an increasingly serious concern.

Statistics come out subtly wrong. Dates shift. Names get muddled. Quotes are attributed to the wrong people or are entirely fabricated. Technical details that were broadly correct a year ago are now presented with a confidence that masks a growing sloppiness.

The terrifying bit is that even the big boys are at it.

Part of the issue is volume. The sheer quantity of AI-generated content now flooding the internet means that the signal-to-noise ratio is deteriorating rapidly. Search engines, which once helped to surface authoritative, well-researched material, are increasingly indexing content that looks credible but has never been within touching distance of a human expert.

The more this content gets cited, shared, and repackaged, the more legitimate it appears… and the more likely it is to end up in the training data for tomorrow’s models.

Why this matters beyond the obvious

You might reasonably argue that this is not a new problem. The internet has always had plenty of poor-quality content. I am not going to argue against that. But I would like to think that there is a meaningful difference between human-written nonsense and AI-generated nonsense.

Human nonsense tends to be obviously human: passionate, idiosyncratic, sometimes wrong in ways that betray the author’s bias or ignorance. AI-generated nonsense wears the clothes of authority. It is typically well-structured, grammatically impeccable and written in a tone that signals confidence and expertise. This is the case even when the underlying content is entirely unreliable.

That combination of superficial polish and factual fragility is particularly dangerous in industries where accuracy genuinely matters: healthcare, finance, law, technical engineering, etc.

When a piece of content looks like it was written by a specialist but was actually produced by a model that was itself trained partly on AI-generated content, the potential for harm is significant.

What can actually be done about it?

As far as I am aware, there is no simple technological fix on the horizon. Watermarking AI content is technically challenging.

What we do have in our battle against the machines is genuine human expertise.

AI is a genuinely useful tool for content creation. It can speed up research, help with structure, overcome the blank-page problem and produce solid first drafts at a pace no human team can match. I fought this for some time but now acknowledge that you are a fool if you dismiss it entirely.

But here is the critical caveat: AI-assisted content needs a very close review from an experienced human being.

Not just any human being. It needs someone with deep, practical knowledge of the subject matter in question. Someone who will immediately spot when a statistic does not look right, when a technical claim contradicts industry reality or when a piece of advice would get a real client into serious trouble.

A good editor will catch grammatical errors. What catches factual errors is subject-matter expertise, and that is not something you can automate.

The bottom line

AI-generated content training on AI-generated content is a loop with only one direction of travel, and that direction is down. The tidal wave of slop may not yet have fully arrived, but I am afraid to say that the tide is very clearly coming in.

Using AI to help produce content is not the problem.

The problem is publishing it without rigorous human scrutiny. Every piece of AI-assisted content that goes out under your brand should pass the critical eye of someone who genuinely knows the industry. Not someone who is vaguely familiar with it. Not someone who can sense-check the grammar and the tone. Someone who knows enough to catch the things that sound right but are not.

In an environment where the LLMs are increasingly learning from each other, the most valuable thing you can bring to content is something AI simply cannot replicate: real expertise, hard won and applied with genuine care.