Over the past two years, Britain has entered a strange new technological standoff—an “AI arms race” not between machines themselves, but between people attempting to use AI tools and institutions attempting to detect that usage. Schools, universities, newsrooms, exam boards, HR departments, publishers and even immigration tribunals increasingly rely on AI-detection tools—systems marketed as able to determine whether a piece of writing was produced by a human or a model such as ChatGPT.
But just as quickly as these detectors are adopted, headlines appear declaring that they are deeply unreliable. Teachers report false positives on student essays. University committees encounter cases where detectors confidently label centuries-old literature as “97% AI-generated.” Journalists find that their own writing is misclassified as synthetic. And repeatedly—perhaps most importantly—AI researchers demonstrate that no existing tool can reliably determine whether text came from an AI or a human being.
This phenomenon creates a public perception that AI models, including ChatGPT, can “deceive” or “bypass” detection systems. The truth, however, is more mundane—and more concerning. AI systems aren't “cheating” in any meaningful sense. Instead, AI detectors fail in predictable, structural, scientifically explainable ways. They fail because they attempt to solve a problem that is, in principle, almost impossible to solve.
In this commentary article, written for a broad UK audience, I aim to unpack why this misconception persists, what the failure of AI detectors reveals about modern AI itself, why British institutions must rethink their reliance on these tools, and how our society should prepare for a future where the line between human and machine authorship is less clear—but no less meaningful.

The idea of an AI detector is immediately compelling. It appeals to our intuitive belief that all tools leave detectable traces—like fingerprints on glass or footprints in soil. If someone uses a calculator in a “no-calculator exam,” we assume there must be a way to prove it. If an essay is written by software rather than a student, surely some subtle signature remains in the writing.
This assumption is reinforced by three broader social narratives:
Technological optimism: the belief that any digital problem can be solved with a digital solution.
Cultural anxiety: a fear that AI models represent an unstoppable tide of “machine-made content.”
Institutional pressure: schools and universities seek clear tools to enforce academic integrity.
Together, these factors create a powerful market demand for systems that can confidently state:
“This text was written by a human” or “This text was written by AI.”
Companies race to meet this demand, marketing detectors as advanced, accurate, and increasingly essential—particularly in the education sector. But the scientific reality has not kept pace with the marketing narrative.
At the core of the issue lies a simple truth: AI-generated text is just text. It carries no hidden metadata, no cryptographic signature and, absent deliberate watermarking, no reliable statistical fingerprint. It is composed of ordinary English words arranged according to the statistical patterns of the training data, just as human writing is shaped by a lifetime of language exposure.
Because large language models are trained on vast amounts of human writing, their outputs often reflect the same rhythms, structures, and conventions that human writers naturally use.
This creates a conceptual impossibility:
If human writing and AI writing can look identical, then detecting authorship from the text alone becomes as infeasible as determining whether a person used inspiration, imagination, caffeine, or a conversation with a friend to form their ideas.
That does not mean AI and human writing are indistinguishable in every case. Sometimes they differ markedly. But crucially:
Humans can write like AI if they choose to.
AI can generate text that resembles human writing.
Any detector can only guess based on superficial patterns.
And because the patterns detectors search for are fragile, inconsistent, and context-dependent, the results are inevitably unreliable.
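To see how superficial these patterns are, consider a deliberately simplified detector. Everything below is a toy: the two features (sentence-length variance and vocabulary diversity) are crude stand-ins for the surface statistics real detectors lean on, and the thresholds are invented for the example. No commercial product works exactly this way, but the failure mode is the same.

```python
import re

def surface_features(text):
    """Crude surface statistics of the kind detectors rely on.
    Assumes non-empty text with at least one sentence."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    words = re.findall(r"[a-zA-Z']+", text.lower())
    lengths = [len(re.findall(r"[a-zA-Z']+", s)) for s in sentences]
    mean_len = sum(lengths) / len(lengths)
    # "Burstiness": how much sentence length varies. Low variation is
    # often (wrongly) read as a machine signature.
    variance = sum((n - mean_len) ** 2 for n in lengths) / len(lengths)
    # Type-token ratio: share of distinct words, a proxy for vocabulary range.
    ttr = len(set(words)) / len(words)
    return variance, ttr

def toy_ai_score(text):
    """Flag text as 'AI-like' when sentences are uniform and vocabulary
    is limited -- exactly the traits of plain, careful human prose."""
    variance, ttr = surface_features(text)
    score = 0.0
    if variance < 4:   # very uniform sentence lengths
        score += 0.5
    if ttr < 0.6:      # repetitive vocabulary
        score += 0.5
    return score
```

Feed this toy a short, plainly written exam answer with uniform sentences and it scores as fully "AI-like"; feed it a rambling anecdote and it scores as fully human. Neither verdict says anything about who wrote the text, only about how it happens to look on the surface.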
This leads to the next important point.
Perhaps the most troubling outcome of the rise of detectors is how frequently they misidentify human writing as AI-generated. False positives occur most often with:
non-native English speakers
students who use simple sentence structures
writers with highly consistent tone
short-form answers
technical explanations
highly structured essays such as those used in UK A-Level exams
Ironically, while universities and schools often deploy detectors to protect against misconduct, the technology disproportionately harms the very students it is supposed to safeguard.
Reports from UK institutions between 2023 and 2025 show a consistent pattern:
Students from multilingual backgrounds are flagged at higher rates.
Essays written under exam conditions are frequently identified as “AI.”
Creative writing assignments with uniform tone or limited vocabulary trigger alerts.
The result:
Students must sometimes “prove their innocence” against an algorithm that cannot justify its own conclusions.
No reputable UK academic committee would accept a plagiarism accusation based solely on a detector score. Yet many front-line educators—pressed for time, or unaware of the limitations—still rely on them as authoritative.
This creates a dangerous situation where AI detectors are treated as if they are breathalysers, when in reality they are closer to mood rings: interpretive, unstable, and not a valid basis for disciplinary action.
Here is the heart of the public misunderstanding:
People often claim that “ChatGPT has learned how to bypass detectors,” or “ChatGPT can deceive detection tools.”
That is not accurate.
What is actually happening is:
Detectors were never reliable to begin with.
Human writing and AI writing frequently overlap.
Small changes—AI-generated or human-made—can shift detector outputs dramatically.
Detectors mistake fluency or simplicity for AI authorship.
In essence, AI models are not “evolving” to evade detection. Instead, the detectors themselves are built on an impossible premise.
Britain now faces a pressing policy challenge: how should schools, colleges, and universities handle the rapid expansion of AI-assisted writing?
There are three prevailing responses in the UK:
Prohibition: banning AI tools outright, though enforcement is impossible.
Regulation: permitting limited use under disclosure policies.
Integration: teaching students how to use AI responsibly.
Among these, the prohibition model has proven least effective. It rests on two flawed assumptions:
That AI use can be reliably detected.
That students who use AI do so only for misconduct.
Universities across the UK increasingly acknowledge the futility of detection-based enforcement and are shifting instead toward educational strategies: teaching critical thinking, evaluating process rather than product, and designing assessments less vulnerable to automation.
Detectors may still be used as one small piece of evidence—but responsible institutions now understand that they cannot serve as the foundation of academic integrity policy.
Beyond education, detector misuse raises serious risks across UK public life.
Some UK employers have begun running applicants’ cover letters through AI detectors. This is extremely problematic. A false positive can unfairly limit employment opportunities, particularly among neurodivergent applicants or non-native English speakers.
Newsrooms have used detectors to mark reader submissions or freelance contributions as “suspicious.” This risks undermining trust in legitimate contributors.
There have been global cases (outside the UK) where legal authorities considered detector results in evaluating asylum or visa applications. Such practices are widely condemned by AI ethicists and human-rights experts.
No serious court in the UK would accept an AI detector output as evidence. Yet misunderstanding of the technology’s capabilities creates the possibility of inappropriate use.
Between 2023 and 2025, a broad consensus emerged among AI researchers:
Reliable detection of AI-generated text is not feasible with current methods, and may never be feasible at scale.
This stems from several factors:
AI models can be trained on human data that includes all forms of writing.
Humans naturally produce text that matches the statistical patterns these models learn and reproduce.
There is no “biological signature” of human authorship.
Even AI companies admit they cannot reliably detect their own models’ outputs.
Some have proposed watermarking future models: embedding statistically invisible patterns into text at the moment of generation. This may help in narrow contexts, but watermarks are easily degraded by paraphrasing, translation or ordinary editing. And importantly, watermarking cannot be applied retroactively to text produced by existing models.
This means detection will always lag behind generation.
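Watermarking is easier to reason about with a toy model. The sketch below illustrates the published "green-list" idea, not any vendor's actual scheme: real proposals bias a language model's token probabilities, whereas here a fake generator simply prefers words that a hash marks green. The vocabulary, function names and thresholds are all invented for the example.

```python
import hashlib
import random

# A made-up 12-word vocabulary standing in for a model's token set.
POOL = ["quick", "fast", "rapid", "swift", "brisk", "speedy",
        "bright", "clear", "plain", "sharp", "clean", "light"]

def is_green(prev_word, word):
    """Pseudo-randomly split the vocabulary using the previous word as
    a seed -- the core trick behind green-list watermarking."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0  # roughly half the words are 'green'

def watermarked_text(length, seed=0):
    """Toy 'generator': always pick a green word when one exists."""
    rng = random.Random(seed)
    words = ["start"]
    for _ in range(length):
        greens = [w for w in POOL if is_green(words[-1], w)]
        words.append(rng.choice(greens or POOL))
    return words[1:]

def green_fraction(words, first_prev="start"):
    """Detector: what share of words fall on their green list?
    Near 1.0 suggests a watermark; around 0.5 is chance level."""
    prev, hits = first_prev, 0
    for w in words:
        hits += is_green(prev, w)
        prev = w
    return hits / len(words)
```

On the toy watermarked output the green fraction sits near 1.0, while arbitrary word sequences hover around 0.5. But replace even a third of the words and the score slides back toward chance, because each substitution breaks the hash link with its neighbour. That is exactly why paraphrasing, translation and editing defeat such schemes in practice.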
Given these realities, the UK must adopt a forward-looking strategy that moves beyond the false hope of text detection.
Instead of banning tools, institutions should teach:
responsible AI use
how to critically evaluate AI-generated content
how to properly disclose AI assistance
This mirrors the shift from banning calculators to teaching mathematics with them.
When educators see the steps of student thinking—draft notes, iterations, reflections—detectors become unnecessary.
Beyond education, government agencies should openly document how and where they use AI tools, including the tools' limitations.
The UK must resist the temptation to adopt AI detectors in sensitive contexts. Ethical AI governance requires caution, transparency, and respect for human rights.
Perhaps the most philosophically important implication of AI’s rise concerns the nature of authorship itself.
Human writing has never been purely “human.” All writing draws on external influences:
teachers
textbooks
the internet
friends and family
collective linguistic patterns
AI is simply a new kind of influence. That does not mean its use should be unregulated—but it does mean we should rethink simplistic notions of “purity” in writing.
As Britain adapts to this new reality, we must resist technological myths and instead cultivate a culture of informed, critical, ethically grounded engagement with AI.
AI detectors promised certainty in a moment of cultural upheaval. They promised clarity, fairness, and a technological solution to a human problem. But as we now understand, they cannot reliably deliver on that promise.
The idea that ChatGPT can “deceive” these systems reflects a misunderstanding—not of AI, but of the detectors themselves. The detectors fail not because AI is cunning, but because the task they claim to accomplish is intrinsically flawed.
As a nation, the UK stands at a crossroads. We can continue relying on tools that provide false confidence and unfair outcomes—or we can confront the reality that integrity, trust, and critical thinking cannot be outsourced to algorithms.
If we choose the latter path, we will build a more resilient, informed and equitable society—one prepared not only to face the challenges of AI, but to harness its potential responsibly and intelligently.