The machine-readability problem is the SEO problem now
AI search visibility isn't a content problem. It's a knowledge architecture problem — and most businesses are paying agencies to solve the wrong one.
For the last two years, the GEO (generative engine optimisation) conversation has been about content. Write for AI summaries. Optimise for citation. Get your brand into the answer. Most of the advice in circulation is some variant of that: a content output problem, dressed up in new acronyms.
I think this is wrong, or at least so badly framed that it's leading people to spend money in the wrong places. The actual bottleneck for most businesses showing up in AI search isn't that their content is bad. It's that their knowledge isn't legible to machines in the first place. The expertise exists. The pages exist. They just don't exist in a form the AI Answer Chain can extract, verify, and trust.
That's a structural problem, not a content problem. And it requires a different kind of work — closer to information architecture than to copywriting. Most of the industry isn't framing it that way, and clients are paying the price.
What "machine-readable" actually means
The phrase gets thrown around as if it means schema markup and a clean sitemap. It doesn't. It means: can an AI system extract a discrete fact from your site, attribute it correctly, verify it against another source, and surface it with confidence?
Krishna Madhavan's framing at SEO Week 2026 was useful here. He described the "AI Answer Chain" — the layers of trust, eligibility, safety, grounding, and retrieval that content runs through before it reaches a user. Indexes, he said, are transformed representations of the web, not copies of it. The model doesn't read your page. It reads the representation of your page that the indexing system produced. If that representation is thin — because your facts are buried in PDFs, hidden behind forms, written as marketing prose without discrete claims, or disconnected from any structured graph — you don't get cited. Not because your content is bad. Because it's illegible.
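To make the "representation, not copy" point concrete, here is a deliberately crude sketch of what a text-only indexing step might keep from two pages. It is an illustration under stated assumptions, not how any real search index works: the extractor, both HTML snippets, and the facility details are hypothetical, and the sketch assumes the indexer does not follow the link into the PDF.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text nodes, a toy stand-in for an indexing step."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def indexable_text(html: str) -> str:
    """Return the text this toy indexer would keep for a page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical pages: one hides the facts in a PDF, one states them in HTML.
pdf_page = '<h1>Capabilities</h1><a href="/brochure.pdf">Download our brochure</a>'
open_page = "<h1>Capabilities</h1><p>ISO 13485 certified facility with a 1,200 m2 cleanroom.</p>"

thin = indexable_text(pdf_page)
rich = indexable_text(open_page)
```

In this toy case the first page's representation contains a heading and a link label, and nothing else. The facts exist, but not in the representation.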
The Search Engine Land audit of Prince Edward Island businesses last week made the same point with depressing clarity. Companies with genuine expertise — biotech, food manufacturing, hospitality, aerospace — were nearly invisible to AI systems. The expertise was real. The machine-readability wasn't. Critical information was trapped in PDFs, locked behind contact forms, or written as narrative rather than fact. Twenty-two percent of B2B buyers now use generative AI for vendor research, per Responsive. If your business can't be parsed, you're not in the consideration set.
Why the content-first framing keeps winning anyway
It wins because it's the easier sale. Writing more content, optimising existing content, producing AI-ready content — these are services agencies already know how to deliver. They have writers. They have processes. They have hourly rates set up for it.
Restructuring how a business exposes its knowledge to machines is a different kind of engagement. It looks more like a technical audit crossed with a data project. It requires someone to look at the way information lives across PDFs, gated assets, CMS fields, schema markup, knowledge panels, and external sources — and rebuild the connections between them. That's not a content brief. That's an architecture project. Fewer agencies sell it, and clients don't ask for it because they've been trained by two years of LinkedIn content to think the problem is messaging.
So the work that actually moves the needle — getting your facts into structured form, linking your entities to verifiable external sources, exposing data that's currently trapped in PDFs, building the schema that lets a model treat your business as a node in a knowledge graph rather than a wall of prose — gets skipped. Clients pay for content production. They get content production. They don't get cited. The diagnosis was wrong from the start.
The agentic shift makes this worse, not better
Mike King's piece on agentic RAG (retrieval-augmented generation) last week is the other half of this argument. The retrieval pipelines behind the first wave of AI search answers in 2023 (Google's Search Generative Experience, later rebranded as AI Overviews) were linear: query, embed, retrieve top-k, generate. Citation tracking made some sense in that world because the citation set was the retrieval set. If you weren't in the top-k passages, you weren't in the answer, and that was that.
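For readers who haven't seen a single-shot pipeline up close, a minimal sketch follows. It assumes a toy bag-of-words "embedding" in place of the dense vectors real systems use, and the function names and passages are hypothetical. The point it illustrates is only that whatever lands in the top-k is the entire pool the answer can cite.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: term counts (real systems use dense vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(query: str, passages: list, k: int = 2) -> list:
    """Query -> embed -> rank -> keep top-k. The generator only sees these."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


# Hypothetical passages from three different pages.
passages = [
    "Our platform handles MiFID II transaction reporting for UK investment firms.",
    "We are a leading provider of innovative solutions.",
    "Contact us to learn more about our services.",
]

citation_pool = retrieve_top_k("MiFID II transaction reporting", passages, k=1)
```

With k set to 1, the only passage that can be cited is the one that states the fact in terms the query can match.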

Agentic RAG breaks that model completely. A single user query now triggers five to twenty internal sub-retrievals. The agent plans, retrieves, reads, grades its own draft, decides whether the evidence is sufficient, and goes back for more. Every one of those sub-retrievals is a gate your content has to pass through. Every gate uses different signals — topical, factual, entity-level, structural — to decide what's eligible.
The implication for machine-readability is brutal. In a single-shot pipeline, you needed to be retrievable for the main query. In an agentic pipeline, you need to be retrievable for as many as twenty sub-queries the agent might generate to satisfy a single user prompt. That's not a content problem. That's a coverage problem. You need your facts to be extractable across many possible angles of the same underlying topic, which means atomised, structured, linked, and verifiable.
If your knowledge lives as long-form prose with no structured representation, you might surface for one of those twenty sub-retrievals. You are unlikely to surface for the other nineteen. The model assembles its answer from what it could verify across all of them, and your single appearance gets crowded out by sources that were extractable from multiple angles.
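The coverage idea can be sketched in a few lines. This is a toy under loud assumptions: the sub-queries, the two sites, the pricing figure, and the term-overlap scoring are all hypothetical, and real agents use far richer eligibility signals. It only shows how one source can win most sub-retrievals while another wins a single one.

```python
import re
from typing import Optional


def tokens(text: str) -> set:
    """Lowercased word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_source(sub_query: str, sources: dict) -> Optional[str]:
    """Source whose best passage shares the most terms with the sub-query."""
    q = tokens(sub_query)
    scores = {
        name: max(len(q & tokens(p)) for p in passages)
        for name, passages in sources.items()
    }
    winner = max(scores, key=scores.get)
    return winner if scores[winner] > 0 else None


def coverage(source: str, sub_queries: list, sources: dict) -> float:
    """Fraction of sub-retrievals this source wins."""
    wins = sum(1 for sq in sub_queries if best_source(sq, sources) == source)
    return wins / len(sub_queries)


# Hypothetical sites: one wall of prose, one set of discrete facts.
sources = {
    "prose_site": [
        "We are a trusted partner delivering world class compliance "
        "solutions with decades of experience."
    ],
    "atomised_site": [
        "Platform supports MiFID II transaction reporting.",
        "Pricing starts at 500 GBP per month.",
        "Head office located in London.",
    ],
}

# Hypothetical sub-queries an agent might plan for one user prompt.
sub_queries = [
    "MiFID II reporting support",
    "pricing per month",
    "head office location",
    "compliance experience",
]

prose_cov = coverage("prose_site", sub_queries, sources)
atomised_cov = coverage("atomised_site", sub_queries, sources)
```

Here the prose site wins one sub-retrieval in four and the atomised site wins the other three, which is the crowding-out effect in miniature.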
This is the part of the GEO conversation that almost nobody is having properly. Everyone is measuring whether their brand appears in the final answer. The actual leverage is upstream — being eligible for retrieval at all the points where retrieval is happening, which depends on structural legibility, not copy quality.
The case studies tell a consistent story
The Prince Edward Island work is a useful pattern catalogue because the businesses involved are genuinely varied. Biotech with technical authority trapped in corporate PDFs. A food manufacturer whose sustainability story was narrative, not data. A hospitality group whose venue specifications were invisible to agentic search. A fisheries business whose vessel-to-plate traceability existed in physical paperwork but not in any structured digital form.
In every case, the intervention wasn't "write better content." It was: take the knowledge that already exists, atomise it into discrete facts, encode those facts in structured form, link them to external verification sources, and expose them in a way the indexing layer can actually process. Coded heritage and facility schema. Linked data to provincial registries. Structured artisanal provenance. Diagnostic logic triples.
This is information architecture work. It uses some of the same vocabulary as SEO — schema, markup, entities — but the discipline is different. SEO has traditionally been about persuading search engines to rank a page. This is about restructuring the underlying knowledge so it can be retrieved, verified, and assembled into an answer at all. The two overlap, but they're not the same job.
The pattern I see in my own work is consistent: the businesses that struggle most with AI visibility are the ones whose expertise is genuinely deep but lives in formats AI systems can't parse. Decades of case studies in PDF. Pricing information behind a "contact us" form. Service descriptions written as differentiation copy rather than factual specifications. Technical capabilities buried in slide decks that never made it to the public site. The content is excellent. The structure is invisible.
What this looks like as actual work
If you're trying to do this properly, the work breaks into four rough buckets, and almost none of it is what an SEO content team typically delivers.
The first is a knowledge audit. Not a content audit — a knowledge audit. What does your organisation actually know? Where does that knowledge currently live? PDFs, slide decks, gated white papers, internal docs, sales collateral, the heads of three senior people who've been there twenty years. None of that is machine-readable in its current form, and most of it isn't on the public web at all.
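One way to keep a knowledge audit honest is to record it as data rather than as a slide. The sketch below is an assumption about how such an inventory might look, not a standard format: the `Asset` fields, the gap labels, and the three example assets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    fmt: str                   # e.g. "html", "pdf", "pptx"
    public: bool               # reachable on the public web
    gated: bool                # behind a form or login
    has_structured_data: bool  # any structured representation exists


def legibility_gaps(asset: Asset) -> list:
    """List the reasons a machine might not be able to use this asset."""
    gaps = []
    if not asset.public:
        gaps.append("not on the public web")
    if asset.gated:
        gaps.append("behind a form")
    if asset.fmt != "html":
        gaps.append(f"locked in {asset.fmt}")
    if not asset.has_structured_data:
        gaps.append("no structured representation")
    return gaps


# Hypothetical inventory for illustration only.
inventory = [
    Asset("Case studies", "pdf", public=True, gated=True, has_structured_data=False),
    Asset("Service page", "html", public=True, gated=False, has_structured_data=True),
    Asset("Capabilities deck", "pptx", public=False, gated=False, has_structured_data=False),
]

report = {a.name: legibility_gaps(a) for a in inventory}
```

An empty list means the asset has no obvious structural gap. Anything else is a line item for the architecture work.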
The second is atomisation. Knowledge has to be broken down into discrete claims that can be extracted, verified, and cited. "We're a leading provider of regulatory compliance software" is not a fact. "Our platform handles MiFID II transaction reporting for 47 UK-based investment firms" is. The first is marketing copy. The second is a data point an AI system can reason about.
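As a rough sketch of what an atomised claim might look like as a record, assuming a simple subject, predicate, object, scope shape: the `Claim` class and the list of vague terms are my own illustration, not an established schema, and the vague-term check is a crude heuristic rather than a real classifier.

```python
from dataclasses import dataclass

# Hypothetical list of words that signal marketing copy rather than fact.
VAGUE_TERMS = {"leading", "innovative", "world-class", "best-in-class", "trusted"}


@dataclass(frozen=True)
class Claim:
    subject: str    # the entity the claim is about
    predicate: str  # what the entity does or has
    obj: str        # the thing acted on
    scope: str      # the qualifier that makes the claim checkable

    def sentence(self) -> str:
        """Render the claim as a single citable sentence."""
        return f"{self.subject} {self.predicate} {self.obj} {self.scope}."


def vague_terms_in(text: str) -> set:
    """Return any vague marketing terms found in the text."""
    words = {w.strip(".,").lower() for w in text.split()}
    return words & VAGUE_TERMS


marketing = "We're a leading provider of regulatory compliance software"
claim = Claim(
    subject="Our platform",
    predicate="handles",
    obj="MiFID II transaction reporting",
    scope="for 47 UK-based investment firms",
)
```

The marketing sentence trips the vague-term check and has nothing to fill the `scope` field with. The atomised claim fills every field and can be checked against an outside source.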
The third is encoding. Structured data, schema markup, entity linking, knowledge graph alignment. This is the bit that overlaps most with traditional technical SEO, but the scope is different. You're not adding FAQ schema to a page. You're trying to make your business itself a node in a knowledge graph — connected to verifiable external sources, with claims that can be cross-referenced against industry registries, regulators, professional bodies, and other authoritative graphs.
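A minimal sketch of the encoding step, assuming schema.org's `Organization` type with its `sameAs` and `knowsAbout` properties and JSON-LD as the serialisation. The company name and every URL below are placeholders, and which properties matter for a given business is a judgment call this example does not settle.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list, knows_about: list) -> dict:
    """Build a schema.org Organization description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # Links to external records that describe the same entity.
        "sameAs": same_as,
        # Topics the organisation has expertise in.
        "knowsAbout": knows_about,
    }


# Placeholder values for illustration only.
doc = organization_jsonld(
    name="Example Compliance Ltd",
    url="https://www.example.com",
    same_as=[
        "https://registry.example.org/company/00000000",
        "https://regulator.example.org/firms/000000",
    ],
    knows_about=["MiFID II transaction reporting"],
)

payload = json.dumps(doc, indent=2)
script_tag = f'<script type="application/ld+json">\n{payload}\n</script>'
```

The `sameAs` links are what turn a self-description into a node with edges: they point at records the business does not control, which is where verification can happen.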
The fourth is maintenance. This is the part agencies are worst at, because it doesn't scale into a content production process. Structured knowledge needs to be kept accurate. Facts go stale. Specifications change. Personnel turn over. If your structured representation drifts out of sync with reality, you stop being a reliable source — and AI systems are getting better at noticing that.
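Maintenance is easier to sell when drift is visible. A hedged sketch of a staleness check follows, assuming each published fact carries a last-verified date. The 180-day threshold, the fact records, and the fixed "today" are all hypothetical choices made for the example.

```python
from datetime import date, timedelta


def stale_facts(facts: list, today: date, max_age_days: int = 180) -> list:
    """Return ids of facts not re-verified within the allowed window."""
    cutoff = today - timedelta(days=max_age_days)
    return [f["id"] for f in facts if f["last_verified"] < cutoff]


# Hypothetical fact records for illustration only.
facts = [
    {"id": "pricing", "last_verified": date(2025, 1, 15)},
    {"id": "headcount", "last_verified": date(2026, 5, 1)},
    {"id": "certification", "last_verified": date(2025, 6, 30)},
]

# Fixed date so the example is reproducible.
needs_review = stale_facts(facts, today=date(2026, 6, 1))
```

Run on a schedule, a check like this gives someone a named list of facts to re-verify, which is closer to a maintenance process than a good intention.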
None of this is glamorous. None of it produces a content calendar or a deliverable that looks impressive in a quarterly review. It's the structural foundation that determines whether anything else you do has any chance of working, and most businesses don't have it.
Where this argument runs out
I want to be careful not to overclaim. Machine-readability isn't sufficient. You can have perfectly structured data and still not get cited if your brand has no authority, your content has no depth, and your domain has no signal that AI systems can trust. The structural layer is a necessary condition, not a complete strategy.
There's also a legitimate counter-argument that I should take seriously: maybe AI systems are getting good enough at extracting unstructured content that the structural work becomes less important over time. Larger context windows. Better reasoning. More sophisticated retrieval. If a model can read a 40-page PDF and pull out the relevant claim reliably, why bother atomising it?
The honest answer is: that future might arrive, but it isn't here yet, and even when it gets here the structural work will still produce better outcomes. Andrea Volpini's point at SEO Week was that bigger context isn't the answer — semantics and navigation are. Throwing more tokens at the problem doesn't solve retrieval. It just delays solving it. The systems still need structured representations of knowledge to navigate efficiently, and the businesses that have provided those representations will keep winning the retrieval lottery.
There's also a measurement problem hanging over all of this, which I've written about before. Even if you do the structural work properly, the feedback loop is dreadful. You can't reliably tell which structural changes drove which citation gains. The agentic pipelines make this worse. So the work I'm describing requires a degree of confidence in the diagnosis without immediate proof in the data — which is a hard sell, and I understand why agencies default to content production instead.
What I'd do if I were starting from scratch
If a UK business came to me tomorrow asking how to be more visible in AI search, the conversation would not start with content. It would start with: show me everywhere your knowledge currently lives. Show me your PDFs, your gated assets, your slide decks, your internal wikis, your product specs. Show me what's on the public site and what isn't. Show me how a single fact about your business, a real and specific claim, would travel from where it actually lives to where an AI system could find it.
In most cases, that journey is broken. The fact lives in a PDF that's behind a form, written as narrative, with no structured representation anywhere on the indexable web. There is no path from where the knowledge exists to where it could be retrieved. No amount of content production will fix that. You can write a hundred blog posts and the underlying knowledge architecture stays invisible to every AI system that matters.
The structural work is the work. The content sits on top of it. Most businesses have spent two years trying to write their way out of a problem that needs to be engineered out instead.
This is the part of the GEO conversation I think the industry has badly underweighted, and it's the part most likely to separate the businesses that show up in AI answers from the businesses that don't. The acronyms don't matter. The tooling doesn't matter. The content templates don't matter. What matters is whether your knowledge is exposed to machines in a form they can actually use — and for most organisations, the honest answer is that it isn't, and nobody is being paid to fix it.