Semantic Intent Drift: The AI Coding Bug Your Tests Can't Catch
Your AI-assisted codebase has a class of bugs that's invisible to traditional testing. The code compiles. The tests pass. The behavior is wrong.
The Bug You Can't See
Something is going wrong in AI-assisted codebases that nobody is talking about clearly enough.
A team ships a reporting system. Executive briefs and full technical reports go through the same pipeline. The code compiles. All tests pass. Both PDF outputs are generated without error.
But they're identical. 486,337 bytes each. The executive brief — meant to be a concise 9-page overview — renders as a full 16-page technical report. The system works perfectly. It just does the wrong thing.
The team debugs for three weeks. They check the PDF library. They check the template engine. They check the rendering pipeline. Everything looks correct, because everything is correct — at the code level.
The bug isn't in the logic. It's in the intent.
What Is Semantic Intent Drift?
Here's what happened. Somewhere in the codebase, a developer (or an AI assistant) wrote this:
```typescript
// Technical characteristic driving business behavior
const isExecutiveBrief = analysisDepth === 'quick';
```

Read that line carefully. It's using a technical characteristic (analysis depth) to determine a business behavior (whether this is an executive document). The WHAT (document type) and the WHY (behavioral purpose) have been separated, and the wrong domain is driving the decision.
This is semantic intent drift: the gradual decoupling of what code does from why it does it.
In a manually written codebase, this kind of drift accumulates slowly. A developer makes a shortcut. Another developer copies the pattern. Over months, the semantic meaning of a flag drifts from its original purpose.
In an AI-assisted codebase, this drift is accelerated. AI code generators optimize for correctness — does the code compile, does it pass tests, does it produce output? They don't optimize for intent preservation — does this code still mean what the developer meant it to mean? When you ask an AI to refactor, extend, or modify code, it maintains syntactic correctness while potentially severing the semantic contract.
The result: code that works but behaves wrong. And because the behavior is technically correct (it produces a PDF, it passes the type checker, it returns a response), your test suite never catches it.
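To make that failure mode concrete, here is a minimal, hypothetical sketch (the names `AnalysisDepth`, `ReportRequest`, and `renderPageCount` are illustrative, not from the case study): a drifted flag that a naive test suite happily approves, using the 9-page and 16-page figures from the case study.

```typescript
// Hypothetical sketch: a technical setting drives a business behavior.
type AnalysisDepth = 'quick' | 'full';

interface ReportRequest {
  title: string;
  analysisDepth: AnalysisDepth;
}

// Drifted version: the technical proxy decides the document type.
function renderPageCount(req: ReportRequest): number {
  const isExecutiveBrief = req.analysisDepth === 'quick'; // intent severed here
  return isExecutiveBrief ? 9 : 16;
}

// This test passes, so the suite is green...
console.assert(
  renderPageCount({ title: 'Executive Brief', analysisDepth: 'quick' }) === 9
);

// ...but an executive brief requested with full analysis silently
// renders as a 16-page technical report. No test fails; the intent does.
console.assert(
  renderPageCount({ title: 'Executive Brief', analysisDepth: 'full' }) === 16
);
```

The second assertion is the tell: it encodes the drifted behavior, not the intended one, and the type checker has no way to object.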
The Fix Is One Line
The solution, once you see it, is almost absurdly simple:
```typescript
// Observable semantic property drives behavior
const isExecutiveBrief = title.includes('executive');
```

Instead of using a technical proxy (analysis depth) to infer document type, derive the behavior directly from the semantic meaning of the document itself. The title says "executive", so it is an executive document. The WHAT and the WHY are unified into a single atomic check.
This is the core of the Semantic Intent pattern: observable semantic properties drive behavioral contracts. Not technical flags. Not HTTP methods. Not boolean parameters. The meaning of the thing determines how it behaves.
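A minimal sketch of that idea, with illustrative names (`SemanticIntent`, `deriveIntent`, and the case-insensitive match are assumptions, not the published implementation): the behavioral contract is derived once, from the observable property, rather than threaded through technical flags.

```typescript
// Hypothetical sketch: derive the behavioral contract from the
// document's own observable meaning, not from a technical flag.
interface SemanticIntent {
  documentType: 'executive-brief' | 'full-report';
  maxPages: number;
}

function deriveIntent(title: string): SemanticIntent {
  // The WHAT (document type) comes directly from the title's semantics.
  const isExecutiveBrief = title.toLowerCase().includes('executive');
  return isExecutiveBrief
    ? { documentType: 'executive-brief', maxPages: 9 }
    : { documentType: 'full-report', maxPages: 16 };
}

console.assert(deriveIntent('Q3 Executive Brief').documentType === 'executive-brief');
console.assert(deriveIntent('Q3 Full Technical Report').maxPages === 16);
```

Note the design choice: every downstream consumer reads the derived intent object, so there is exactly one place where meaning becomes behavior.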
But the fix alone isn't enough. You also need to ensure the intent can't be corrupted downstream. That's where immutable governance comes in:
```typescript
function protectIntent(intent: SemanticIntent): ProtectedIntent {
  const frozen = Object.freeze(intent);
  return new Proxy(frozen, {
    set(target, property) {
      throw new Error(
        `Semantic contract violation: Cannot modify ${String(property)}`
      );
    }
  });
}
```

Once a semantic intent is derived, it's frozen. No transformation layer can mutate it. No middleware can override it. The behavioral contract is locked from the point of derivation to the point of execution.
The Results
When we applied this pattern to the production reporting system:
Before
Identical 486,337-byte PDFs. Zero behavioral differentiation. Three-plus weeks of debugging with no resolution.
After
Executive briefs: 9 pages. Full reports: 16 pages. 78% behavioral differentiation. Resolution time: one session.
These numbers are from a single production case study with tracked validation IDs. The research is published with a DOI (10.5281/zenodo.17114972) and the implementation is open source with the complete git history, including the breakthrough commit.
Why This Is About to Get Much Worse
Right now, most teams using AI-assisted coding are in the honeymoon phase. They're shipping faster. Generating more. Feeling productive. The codebases are growing rapidly with AI-generated contributions.
The maintenance phase hasn't hit yet. But it will.
When it does, teams will discover that AI-generated code has a specific failure mode that traditional code doesn't: it's locally correct but semantically drifted. Every function works. Every test passes. But the aggregate behavior of the system has silently diverged from its intended purpose — because the AI optimized for correctness, not for intent preservation.
This isn't a hypothetical. The PDF case study is one instance of a pattern that's going to become endemic. Every system where AI generates code that encodes business behavior — report generators, API handlers, configuration managers, workflow engines, form processors — is a candidate for semantic intent drift.
The organizations that formalize intent preservation now will have a structural advantage. Not because they'll write better code — but because they'll be able to trust their AI-assisted codebases at scale. Intent preservation is the difference between an AI that helps you build faster and an AI that builds you a beautiful system that slowly does the wrong thing.
Try It Now
The Semantic Intent pattern isn't just a research paper. It's a practical tool you can apply today.
Template Generator
Pick your context. Fill in the semantic fields. Copy the template. 30 seconds to add intent preservation to any piece of code.
Read the Paper
Full research with the complete case study, mathematical formalization, implementation evidence, and immutable governance framework.
The implementation is open source on GitHub. Complete git history. The breakthrough commit is 7de571c.