AI visibility isn’t a content problem. It’s a diagnosis problem.

Duane Forrester's three-layer AI visibility model is right. But most marketing orgs can't use it, because the diagnostic step doesn't exist in the workflow.

Duane Forrester published a piece at Search Engine Journal yesterday that I think will be quietly important. The argument is simple: AI visibility isn't one problem. It's three, sitting on three structurally different layers — retrieval, relationships (entities, knowledge graphs), and synthesis. Each has its own failure modes. Each has its own fixes. Diagnose the wrong one and the fix doesn't land.

He's right. The framework is clean and I'd recommend reading it. But the framework on its own doesn't fix anything, because the actual problem in most marketing organisations isn't an absence of mental models. It's an absence of the diagnostic culture that would let anyone use one.

When a brand stops appearing in ChatGPT, the default response is to commission more content. Not because anyone thinks more content is the answer, exactly — but because content is the lever the team already owns, the budget line already approved, the activity that produces visible output in the next status meeting. The instinct to write more is the instinct to do the thing you can do, not the thing the situation requires.

That's the deeper problem. And it's the one nobody is writing about.

The default fix is always the one you can already buy

I've watched this happen at enough mid-market companies to be confident it's structural rather than individual. A brand notices a drop in ChatGPT citations or Perplexity share of voice. The conversation in the room goes something like this. *We need to be more visible in AI search. What's our AI content strategy? We need to brief more pieces, faster, with better structure for retrieval.*

Nobody in that conversation has run a log-file analysis. Nobody has checked whether the brand exists as a clean entity in any knowledge graph. Nobody has examined whether their pages are even being retrieved, or whether they are being retrieved but losing the synthesis step to a competitor with stronger third-party validation. The retrieval-versus-relationships-versus-synthesis question is never asked, because asking it would require a kind of work the team isn't set up to do.

So the team does the thing it can do. It writes more.

This is the operational reality Forrester's framework runs into. The three layers are real. But the org chart only has one layer on it, and that layer is content.

What a real diagnostic would look like

If you take the three-layer model seriously, the first thirty days of any AI visibility problem should be diagnosis, not production. The questions are different at each layer and the signals are different at each layer.

Figure: three diverging paths from a single node, showing the distinct AI visibility failure layers.

At retrieval, the question is: can the model fetch our content at all, and is it fetching the right pages for the right queries? The signals live in server logs, under the user agents of the AI crawlers: GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot. Are they hitting the URLs you care about? Are those URLs returning 200s within the latency budget? Mike King wrote a piece last week about 499 response codes (the status nginx logs when the client gives up before the response is sent, which here means pages timing out before AI crawlers complete the fetch), and Profound's data showed pages with high failure rates received roughly 18x fewer citation events on average, often zero. That's not a content problem. That's an infrastructure problem masquerading as a content problem.
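To make the log check concrete, here is a minimal sketch of how those counts could be pulled from an access log. It assumes the common "combined" log layout and matches crawlers by user-agent substring; the function names are mine, and the regex will need adjusting for your server's format. Treat it as a starting point, not a finished audit.

```python
import re
from collections import Counter, defaultdict

# User-agent substrings for the crawlers named above.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot")

# Assumption: "combined" access-log layout, i.e.
# ... "METHOD /path PROTO" STATUS BYTES "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_status_counts(lines):
    """Return {crawler: Counter({status_code: hits})} for AI crawler requests."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed layout
        agent = match.group("agent")
        for bot in AI_CRAWLERS:
            if bot in agent:
                counts[bot][int(match.group("status"))] += 1
                break
    return counts


def failure_rate(status_counts, failure_codes=(499,)):
    """Share of a crawler's requests that ended in one of failure_codes."""
    total = sum(status_counts.values())
    if total == 0:
        return 0.0
    failed = sum(status_counts[code] for code in failure_codes)
    return failed / total
```

Passing an open log file to `crawler_status_counts` and then calling `failure_rate` on each crawler's counter gives a first read on whether the 499 pattern applies to you. Which codes count as failures is a judgment call; 5xx responses are an obvious candidate to add.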

At the relationship layer, the question is: do you exist as a recognised entity, or as a fuzzy string of tokens that gets pattern-matched against fifty other candidates? Knowledge graph presence, Wikidata, schema markup, consistent entity descriptions across the web, third-party validation. This is the layer where brand actually compounds, and it's the layer almost no one audits because there's no agency selling it as a monthly retainer.

At synthesis, the question is: when the model has retrieved your content and recognised you as an entity, why is the response still favouring a competitor? Usually because the third-party signal stack — reviews, citations, comparisons, mentions in trusted sources — looks weaker. Or because the content itself, while retrievable, doesn't synthesise cleanly into the kind of self-contained answer the model is constructing.

Three different problems. Three different fixes. None of them are "publish more articles."

The metrics that would catch a misdiagnosis

The reason teams keep misdiagnosing is that the dashboards aren't built to surface the right signal. Most marketing dashboards measure published-pieces-per-week and organic-sessions and, if the team is ambitious, share-of-voice in a couple of LLM trackers. None of those tell you which layer is failing.

A diagnostic dashboard would have three sections. The first would show retrieval health — AI crawler hit rates, 499 response rates by URL group, time-to-first-byte for AI user agents, percentage of priority URLs successfully fetched by each major crawler in the last fourteen days. This is plumbing data and it sits in server logs. Most teams have never extracted it.
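One of those retrieval metrics, the share of priority URLs each crawler fetched successfully in the last fourteen days, is simple to compute once the log rows are parsed. The sketch below assumes the rows are already reduced to `(timestamp, crawler, path, status)` tuples and that "successfully fetched" means a 2xx response; both are my assumptions, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def priority_coverage(events, priority_paths, crawlers, now, window_days=14):
    """Fraction of priority paths each crawler fetched with a 2xx status.

    events: iterable of (timestamp, crawler, path, status) tuples
            (assumed shape, produced by whatever log parser you use).
    """
    cutoff = now - timedelta(days=window_days)
    # Start every crawler at an empty set so zero coverage is reported too.
    fetched = {crawler: set() for crawler in crawlers}
    for timestamp, crawler, path, status in events:
        if crawler not in fetched or path not in priority_paths:
            continue
        if timestamp >= cutoff and 200 <= status < 300:
            fetched[crawler].add(path)
    total = len(priority_paths)
    return {
        crawler: (len(paths) / total if total else 0.0)
        for crawler, paths in fetched.items()
    }
```

Listing the crawlers explicitly is deliberate: a crawler that never appears in the logs shows up as 0.0 rather than silently dropping out of the report.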

The second would show entity health — knowledge panel presence, Wikidata entity status, schema validation coverage, mentions in third-party sources that the major models are known to weight (Wikipedia, established trade press, structured directories in your category). Not "do you have schema." But: do you exist as a thing the systems can resolve.
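As a rough first pass at the self-declared part of entity health, the sketch below inspects a page's JSON-LD for a schema.org `Organization` node and its `sameAs` links. It is only an illustration: the function name and the returned fields are my own, and it checks what you claim about yourself, not whether any knowledge graph actually resolves you. That second question still needs a manual audit.

```python
import json


def organization_signals(jsonld_text):
    """Summarise the Organization markup found in a JSON-LD string."""
    data = json.loads(jsonld_text)
    # JSON-LD may be a single node, a list of nodes, or wrapped in "@graph".
    nodes = data if isinstance(data, list) else data.get("@graph", [data])
    for node in nodes:
        if not isinstance(node, dict):
            continue
        types = node.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if "Organization" not in types:
            continue
        same_as = node.get("sameAs", [])
        if isinstance(same_as, str):
            same_as = [same_as]
        return {
            "has_organization": True,
            "has_name": bool(node.get("name")),
            "same_as_count": len(same_as),
            # Heuristic only: a Wikidata URL in sameAs, not a verified match.
            "links_wikidata": any("wikidata.org" in url for url in same_as),
        }
    return {
        "has_organization": False,
        "has_name": False,
        "same_as_count": 0,
        "links_wikidata": False,
    }
```

A page can pass every one of these checks and still fail to resolve as an entity, which is exactly the gap between "do you have schema" and "do you exist as a thing".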

The third would show synthesis outcomes — citation share by query class, sentiment of mentions when you are cited, competitor co-citation patterns, whether your content is being cited as the answer or quoted as a counterpoint. This is the layer most existing tools measure, and it's the only layer most teams look at.

The diagnostic move is this: if synthesis is weak, look one layer down before you commission content. If the retrieval data shows a crawler failure rate above twenty percent on priority URLs, no amount of new content fixes it. If the entity data shows you're not in any knowledge graph for your category, no amount of new content fixes that either. Content can only fix problems that are actually content problems, and those are a smaller share of cases than the industry's behaviour suggests.
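Written as a rule, the triage order looks something like the sketch below. The twenty percent threshold comes from the paragraph above; the function name, the boolean entity check and the return labels are simplifications of mine, and a real diagnosis would weigh more signals than two.

```python
def first_layer_to_check(crawler_failure_rate, in_knowledge_graph,
                         failure_threshold=0.20):
    """Return the lowest layer that looks broken, checked bottom-up.

    crawler_failure_rate: share of AI crawler requests to priority URLs
                          that failed (0.0 to 1.0).
    in_knowledge_graph:   whether the brand resolves as an entity in any
                          knowledge graph for its category.
    """
    if crawler_failure_rate > failure_threshold:
        return "retrieval"  # infrastructure first; new content cannot fix this
    if not in_knowledge_graph:
        return "relationships"  # entity work before content work
    return "synthesis"  # only now is content a candidate fix
```

The ordering is the point: the function never returns "synthesis" while a lower layer is failing, which mirrors the rule of looking one layer down before commissioning content.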

The accountability question nobody wants to answer

Here's the part of this conversation that doesn't get aired. The three-layer model implies three different owners, and most companies have one. Retrieval is an infrastructure and DevOps problem. The relationship layer is a brand and PR and structured-data problem. Synthesis is closer to traditional content and digital PR. Each of those reports to a different head, sits on a different team, has a different budget cycle.

When AI visibility starts dropping, the ticket goes to whichever team the CMO thinks is responsible. And that team will do the thing its budget already covers. The content team writes content. The technical SEO team audits schema. The PR team chases coverage. None of them are wrong, exactly. But none of them are coordinated, and the diagnostic question, *which layer is actually failing*, gets answered implicitly by whoever happens to receive the brief.

This is the same coordination failure I wrote about in the Conductor report piece a few weeks ago. Enterprises ranked scaling AI content as their top priority and the experts in the same report flagged it as a likely cause of visibility collapse. The disconnect isn't a strategy problem. It's that the diagnostic step is missing from the workflow entirely. Production starts before diagnosis is even attempted.

Forrester's three-layer framework is the diagnostic step. But for the framework to actually get used, somebody senior has to sit between the three owners and refuse to authorise spend on the wrong layer. That role doesn't exist in most marketing orgs. The fractional version of it is what I do for some clients, and I can tell you the pattern: the first three months of any engagement are mostly spent telling people to stop doing things that aren't working, before any new work gets commissioned. It's not a popular position. It's the right one.

Where Lily Ray's analysis fits

Lily Ray published a piece this week tracking 220+ sites publicly identified as customers of AI content platforms. The pattern she describes is the one I wrote about yesterday — climb, plateau, collapse. What's interesting about reading her piece next to Forrester's is that they're describing the same failure from opposite ends.

Ray is showing what happens when teams answer every visibility question with content production. Forrester is explaining why that answer is structurally wrong in most cases. The bridge between the two pieces is the diagnostic discipline neither of them quite spells out: most of the time, when a brand's AI visibility is declining, the cause isn't insufficient content. It's that the content already published isn't being retrieved cleanly, or the brand doesn't exist as a recognised entity, or the synthesis layer is favouring competitors with stronger third-party validation. Adding more content to a system that has any of those problems just adds more pages with the same underlying issues.

Ray's data shows the empirical result. Forrester's framework explains the mechanism. The accountability gap — who in the org is allowed to say "stop producing, start diagnosing" — is the part neither piece addresses, because neither piece is about org design. But the org design is where the framework either gets used or doesn't.

The honest limits

I want to be careful here. The three-layer model is a simplification. In practice the layers bleed into each other — retrieval health affects entity recognition because models use the content they retrieve to build the entity, and entity strength affects synthesis because recognised entities get cited more consistently. The clean separation is useful as a diagnostic prompt, not as a literal description of the pipeline.

I'd also concede that some failures genuinely are content failures. If your content is thin, AI-generated at scale with no editorial floor, or duplicative of better sources, no amount of infrastructure work fixes that. The point isn't that content never matters. The point is that the industry's default assumption — that visibility problems are content problems — is wrong in the majority of cases I see, and the diagnostic step is the missing link.

And the diagnostic tooling is genuinely immature. Citation monitoring is uneven across platforms. Log-file analysis at the granularity you'd want is not standard. Knowledge graph auditing is a specialist skill held by maybe a few hundred people globally. Until that tooling matures, even teams that want to diagnose properly will be doing it with partial signal. That's not a reason to skip the step. It's a reason to be honest about the confidence intervals when you do.

What this means for the next quarter

If you're running marketing at a mid-market business and AI visibility is on your plate this quarter, the move isn't to commission a content sprint. The move is to spend the first month doing diagnosis. Pull the logs. Check the entity. Look at where you're being cited versus where competitors are, and what the third-party signal stack looks like underneath those citations.

Then — and only then — decide what to commission. If the answer is content, fine. Commission content. But commission it for the specific layer-level problem you've identified, not because content is the lever that was already in your hand when the question came up.

The three-layer model is a good piece of thinking. It deserves to become the default mental model. But it'll only matter if the diagnostic culture catches up to it. Most strategy decks being shown to UK businesses right now are operating on assumptions the layered model has just contradicted. The decks won't update themselves. Somebody has to refuse to sign them.

That's the work.
