EntityMap and schema can’t fix what’s actually broken in AI search

EntityMap entered consultation today. It's useful work. But the attribution problem in AI search lives after retrieval, where no publishing standard reaches.

A new open standard called EntityMap entered public consultation this morning. It joins a now-crowded shelf of proposals — schema.org extensions, Microsoft's NLWeb, llms.txt, ai-plugin manifests — all built on the same underlying belief: that if we just give AI systems cleaner, more structured data about our businesses, the hallucinations stop, the citations stick, and the attribution chain holds.

The proposals are mostly sensible. EntityMap in particular is a thoughtful piece of work from Dixon Jones, and the gap it identifies between sitemap.xml and schema.org is real. I'll probably end up recommending parts of it to clients.

But the entire conversation is fighting the wrong half of the problem.

Retrieval was never the hard part. The hard part is what happens after retrieval — the generative step where an LLM takes a correctly retrieved passage and decides what to do with it. That step degrades attribution in ways no amount of structured source data can fix. And until the industry stops treating "publish better structured data" as the answer, we're going to keep watching publishers tighten their schema while their citations quietly evaporate.

The assumption underneath every proposal

Read the EntityMap proposal carefully and the implied causal chain is this: bad AI answers come from messy source data. Make the source data structured, machine-readable, and evidence-linked, and the AI's answer will be accurate, attributed, and traceable.

The schema-for-the-agentic-web piece in Search Engine Land makes the same assumption from a different angle. NLWeb makes it explicitly — it's described as turning your website into something an agent can "interrogate directly" and receive "deterministic, real-time answers" from.

The word *deterministic* is doing a tremendous amount of work in that sentence.

Because the actual architecture doesn't behave that way. A retrieval-augmented generation pipeline has two stages: retrieve the relevant passage, then generate an answer using that passage. The first stage is deterministic-ish — give it the same query and roughly the same corpus, it returns roughly the same chunks. The second stage is not. The LLM rewrites, summarises, blends, sometimes contradicts the source, sometimes drops attribution entirely, sometimes invents attribution that wasn't in the retrieved content at all.

EntityMap improves stage one. Schema improves stage one. NLWeb mostly improves stage one. Stage two is where the trust problem lives.
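A toy sketch makes the asymmetry between the two stages concrete. Everything in it is an assumption for illustration: the two-page corpus, the word-overlap retriever, and above all the `generate` function, which is a random stand-in for an LLM and not a model of any real one. It shows only the shape of the problem: stage one returns the same passage every time, while stage two varies and sometimes loses the source.

```python
import random
import re

# Hypothetical two-page corpus; the URLs and copy are invented for illustration.
CORPUS = {
    "https://example.com/pricing": "Acme Ltd charges 49 pounds per month for the starter plan.",
    "https://example.com/about": "Acme Ltd was founded in Guildford and serves small businesses.",
}

def tokens(text):
    """Lower-cased word set used for a crude overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, corpus):
    """Stage one: return the (url, passage) pair with the largest word overlap.
    The same query and corpus always produce the same result."""
    query_tokens = tokens(query)
    return max(sorted(corpus.items()),
               key=lambda item: len(query_tokens & tokens(item[1])))

def generate(passage, url, rng):
    """Stage two stand-in, NOT a real model: compress the passage to a random
    length and keep the citation only some of the time."""
    words = passage.split()
    keep = rng.randint(3, len(words))
    answer = " ".join(words[:keep])
    if rng.random() < 0.5:
        answer += f" (source: {url})"
    return answer

url, passage = retrieve("How much does the starter plan cost per month?", CORPUS)
for seed in range(3):
    print(generate(passage, url, random.Random(seed)))
```

Nothing a publisher does to the corpus changes what `generate` is allowed to do with the passage once it has it.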

What the research actually shows

A paper that landed on arXiv this morning makes the point cleanly. The authors held retrieval fixed and varied only how retrieved documents were *represented* to the generator — thirteen different transformations covering selection, summarisation, and reformulation. They measured both answer accuracy and what they called *answer retention*: whether the known correct answer was still recoverable from the document after transformation.

The finding: answer retention was the primary determinant of generator accuracy. When retention was high, the specific wording, structure, length, and query-dependence of the representation barely mattered. When retention was low, no amount of structural cleverness saved it.
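To make the idea of answer retention tangible, here is a minimal sketch. It assumes retention can be approximated by asking whether a known answer string still appears, after light normalisation, in the transformed document. I don't know how the paper operationalises it, and the two transformations below (`first_sentence`, `head_words`) are my own hypothetical examples, not the thirteen the authors tested.

```python
import re

def normalise(text):
    """Lower-case and strip punctuation so matching is not tripped up by formatting."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def answer_retained(answer, document):
    """True if the known answer still appears as whole words in the document."""
    return f" {normalise(answer)} " in f" {normalise(document)} "

def retention_rate(answer, transform, documents):
    """Share of documents that still contain the answer after the transformation."""
    kept = sum(answer_retained(answer, transform(doc)) for doc in documents)
    return kept / len(documents)

# Hypothetical transformations standing in for what a pipeline might do.
def first_sentence(doc):
    return doc.split(". ")[0]

def head_words(n):
    return lambda doc: " ".join(doc.split()[:n])

doc = ("Acme Ltd offers three plans. The starter plan costs 49 pounds per month. "
       "Support is included.")
answer = "49 pounds"

print(answer_retained(answer, doc))                  # True
print(answer_retained(answer, first_sentence(doc)))  # False
print(retention_rate(answer, head_words(12), [doc]))
```

The source passage is identical in all three calls. What changes is whether the answer-bearing span survives the transformation.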

Translate that out of academic register and you get something quite uncomfortable for the EntityMap-and-schema worldview: *the AI's answer depends mostly on whether the model's processing of your content preserves the fact, not on how cleanly you structured the input.* You can hand it a perfectly typed, evidence-linked, attribution-rich passage, and if the generation step compresses or paraphrases the answer-bearing portion out of existence, the citation chain breaks anyway.

This isn't a theoretical edge case. It's the default behaviour of every production LLM when summarising retrieved content. Ask ChatGPT to answer a question from a 2,000-word source document and watch what happens to the specific attributable claim by the time it reaches the user. The publisher name might survive. The exact quote rarely does. The link sometimes does, sometimes doesn't, sometimes points at a hallucinated URL that resembles the real one.

The retrieval layer can be perfect and the attribution can still degrade to nothing.

What this means for the standards conversation

I want to be careful here. I'm not arguing EntityMap is pointless, or that schema markup doesn't matter, or that publishers should stop bothering with structured data. All three improve the floor. Better retrieval inputs lead to better-grounded outputs on average, and on average is how aggregate platform behaviour gets shaped.

But the way these proposals are being framed — as solutions to AI misrepresentation, hallucination, and attribution loss — overpromises in a way that's going to embarrass the people advocating them within twelve months.

Three specific problems.

First, the framing implies that publishers have agency over outcomes they don't actually control. If I deploy EntityMap perfectly and Claude still drops my attribution when it summarises, what's the recourse? There isn't one. The standard puts the burden on the publisher and the discretion with the model.

Second, the proposals describe what amounts to a contract — "here's our structured evidence, here's our attribution metadata, please preserve it" — that no LLM provider has agreed to honour. Schema.org works because Google and Bing agreed to read it and act on it consistently. EntityMap arrives in a world where the consumers haven't signed up.

Third, and this is the bit that should make publishers nervous, the standards conversation lets the platforms off the hook. If publishers are responsible for making their content machine-readable, and the AI systems are simply doing their best with what they're given, then attribution failures become the publisher's fault for not implementing the latest spec well enough. The actual locus of the problem — the generative step inside someone else's black box — gets quietly framed out of the discussion.

The part nobody wants to say

The attribution model AI search needs isn't a publishing standard. It's a behavioural commitment from the model providers, enforced by something stronger than goodwill.

Some version of: *if your retrieval step pulls from a source, your generation step must preserve a working citation to that source, and if you fail to do so, there is a consequence.* The consequence could be regulatory. It could be contractual via licensing deals. It could be reputational once measurement matures enough to make the failures visible at scale. But it has to exist, because without it, "preserve attribution" is just a polite request to a system that has no economic reason to comply.
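For a sense of what making those failures visible might involve, here is a rough sketch of a citation audit. It assumes you can see both the URLs the retrieval step pulled and the final answer text, which in practice you usually can't from outside the platform. The function name, the URL pattern, and the exact-match comparison are all my own simplifications.

```python
import re

# Deliberately simple URL matcher; stops at whitespace and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")

def audit_citations(answer, retrieved_urls):
    """Compare URLs cited in the answer with the URLs that were retrieved.

    preserved: retrieved and cited
    dropped:   retrieved but not cited
    invented:  cited but never retrieved (for example a lookalike URL)
    """
    cited = {u.rstrip(".,;") for u in URL_PATTERN.findall(answer)}
    retrieved = set(retrieved_urls)
    return {
        "preserved": sorted(cited & retrieved),
        "dropped": sorted(retrieved - cited),
        "invented": sorted(cited - retrieved),
    }

# Hypothetical answer that cites a URL resembling, but not matching, the source.
answer = "Acme charges 49 pounds a month (https://example.com/pricing-plans)."
print(audit_citations(answer, ["https://example.com/pricing"]))
```

Exact string matching is strict on purpose here: a trailing slash or tracking parameter would count as a mismatch, so a real audit would need URL normalisation before comparing.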

Rand Fishkin's piece this week made roughly this argument from the political angle — that the only durable solution is collective action, antitrust enforcement, and self-preferencing being made illegal. I'm more sceptical than Rand that the political route arrives in time to matter, but he's right about the diagnosis. The standards-and-schema route is treating a power imbalance as a data quality issue, and it isn't one.

Mike King's *Machine Media* essay framed it differently — that the website is becoming a data source feeding an agent that decides what the user sees. Once you accept that framing, the question of who controls the attribution layer becomes the central economic question of the next decade. Standards that don't bind the agent's behaviour don't move that needle.

What sensible implementation looks like anyway

None of this means doing nothing. The fact that schema and EntityMap can't solve the attribution problem doesn't make them worthless; it makes them necessary but insufficient. The right posture is to implement them cleanly, with realistic expectations, while treating the actual attribution problem as one that sits downstream of anything you publish, inside the generation step.

Practically, for the kind of UK businesses I work with:

Implement schema on the pages that matter. Not every page. The product, service, expertise, and location pages where AI systems are most likely to pull from. Completeness beats coverage, as the Search Engine Land piece notes correctly. A fully populated Product or Service schema on twenty pages beats thin markup across two hundred.
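As a rough illustration of what "fully populated" could mean, the sketch below builds a schema.org Product object as JSON-LD and checks it against a short list of properties. The property names are real schema.org terms, but the `EXPECTED` list is my own assumption about what counts as complete, not a requirement from schema.org or any search engine, and the product details are invented.

```python
import json

# Assumption: a house checklist of properties we want on every product page.
EXPECTED = ["name", "description", "image", "sku", "brand", "offers"]

def missing_properties(markup, expected=EXPECTED):
    """Return the expected properties that are absent or empty in the markup."""
    return [prop for prop in expected if not markup.get(prop)]

# Hypothetical product; values are placeholders.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Starter plan",
    "description": "Monthly SEO support for small businesses.",
    "sku": "ACME-STARTER",
    "brand": {"@type": "Brand", "name": "Acme Ltd"},
    "offers": {
        "@type": "Offer",
        "price": "49.00",
        "priceCurrency": "GBP",
        "availability": "https://schema.org/InStock",
    },
}

print(json.dumps(product, indent=2))  # body for a <script type="application/ld+json"> tag
print(missing_properties(product))    # ['image']
```

Running a check like this across your twenty priority pages is a quicker win than spreading partial markup across the whole site.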

Watch EntityMap, but don't be first. The consultation runs to 30 June 2026, with launch on 1 July. There is no advantage in implementing a draft spec before adoption signals are clear. Read the spec, understand what it's trying to do, and wait to see whether OpenAI, Anthropic, Google, or Perplexity say anything about consuming it. If two of those four indicate they'll read it, implement. If none do, file it as interesting and move on.

Stop measuring AI visibility as if attribution were the goal. It used to be. It isn't any more. Citations from AI systems are a leaky, unreliable signal of presence in the answer layer. Brand mention frequency, prompted-recall surveys, and direct traffic from AI surfaces (where measurable via referrer or post-click survey) are sturdier indicators. I wrote about this in the AI search measurement piece — the gap isn't tooling, it's definition.
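A minimal sketch of the difference between the two signals, assuming you have collected a sample of AI answers to prompts relevant to your business. The answers, the brand, and the domain below are all invented, and the matching is crude on purpose.

```python
import re

def mention_rate(answers, brand):
    """Share of answers that mention the brand as a whole word, ignoring case."""
    if not answers:
        return 0.0
    pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    return sum(1 for a in answers if pattern.search(a)) / len(answers)

def citation_rate(answers, domain):
    """Share of answers that include the domain anywhere in the text."""
    if not answers:
        return 0.0
    return sum(1 for a in answers if domain.lower() in a.lower()) / len(answers)

# Hypothetical sample of answers gathered from AI surfaces.
answers = [
    "Acme is a popular choice for small firms in Surrey.",
    "Options include Acme and Bolt (acme.example/pricing).",
    "Several local agencies offer this service.",
    "Many reviewers recommend ACME for starter budgets.",
]

print(mention_rate(answers, "Acme"))           # 0.75
print(citation_rate(answers, "acme.example"))  # 0.25
```

In this made-up sample the brand is present in three answers out of four but cited in only one, which is the gap a citation-only dashboard would hide.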

Build the inputs that survive the generation step. This is the boring bit. Strong brand, distinctive product, clear expertise positioning, content that says specific things only you would say. The reason these survive is that they're *compression-resistant*: even when the LLM compresses your content, the specificity that made it worth retrieving in the first place tends to come through. Generic content gets paraphrased into nothing. Distinctive content keeps a fingerprint.

The honest limits

I might be wrong about EntityMap specifically. If the major LLM providers — and it really only takes two — publicly commit to consuming the standard and preserving its attribution metadata through generation, the calculation changes overnight. That's not impossible. OpenAI has talked about source-preservation work. Anthropic publishes about constitutional approaches to citation. There's a world where the spec becomes a coordination point that platforms rally to.

I'm also potentially understating the second-order benefit. Even if attribution still degrades, structured input may shift the *average* answer quality upward enough to matter commercially. A model that misattributes you 40% of the time is worse than one that misattributes 25% of the time, even if neither is what you wanted.

And there's a reasonable counter-argument that says: yes, the generation step is the real problem, but the only way to pressure platforms into fixing it is to first build the infrastructure that makes the failures legible. You can't measure attribution loss until attribution exists in the source. EntityMap might be a precondition for the political fight Fishkin describes, not a substitute for it.

I take that point. It just doesn't change what I'd tell a client to spend on this quarter.

What I actually think

The standards conversation is necessary work being asked to carry more weight than it can hold. EntityMap, schema, NLWeb — all useful, all worth tracking, all worth implementing in the right contexts. None of them are the answer to "why does AI keep getting my business wrong," because the answer to that question lives inside someone else's model, and no publishing standard reaches into that model and constrains it.

Publishers spent fifteen years optimising for an indexing layer they didn't control, assuming Google would honour the implicit contract. The contract broke. Now we're being asked to optimise for an attribution layer we don't control, on the same assumption.

The lesson from the last cycle should have been: don't build your discovery strategy on infrastructure where the counterparty has no commitment to behave consistently. The lesson from this cycle, if we're paying attention, is the same.

Implement the structured data. Don't believe the structured data is enough.
