The agent runtime is the new SEO evaluation layer

AI models don't read your website. The runtime does, and hands the model whatever survived the fetch. That changes the optimisation stack.

There's a sentence in the Search Engine Journal piece this week that I haven't been able to stop thinking about: *"the model reads what the runtime hands it."*

That's the whole shift. One line. And the SEO industry has barely started reckoning with what it means.

For eighteen months the conversation has been about models. Which one cites more accurately. Which one's crawler matters. Whether to optimise for ChatGPT or Claude or Gemini or Perplexity or Google AI Mode. Every week another comparison, another benchmark, another vendor pitching a tool that tracks "AI visibility" across the big chatbots as if those chatbots were the actual readers of your website.

They aren't. They never really were. And after the past three weeks of infrastructure announcements from Cloudflare, OpenAI, and Google, the architecture underneath all of this is now visible enough that pretending otherwise is a strategic mistake.

The model isn't the reader. The runtime is.

What actually shipped

Let me be specific about what happened, because the SEO press largely covered it as a developer story rather than a discovery story, and that framing is wrong.

On 15 April, Cloudflare shipped Project Think — a new Agents SDK built around durable execution, crash recovery, sandboxed code, persistent sessions with tree-structured messages, and isolated sub-agents. The same day, OpenAI released the next version of its Agents SDK with native sandbox execution and a model-native harness. The next day, Cloudflare added five more pieces: a vendor-agnostic inference layer, a managed vector index with chunking specifically for agent retrieval, an email service for agents, Postgres and MySQL inside Workers, and infrastructure to host very large open-source models on its network.

A week earlier, on the Cheeky Pint podcast, Sundar Pichai had already described Search itself as "an agent manager" — saying that information-seeking queries would become agentic, with "many threads running" per query.

Three of the largest infrastructure operators on the public web put forward competing answers to the same question, in the same fortnight: two in shipped SDKs, one in how its CEO now describes Search. The question was not "which model writes better." The question was: how does a long-running agent actually run in production?

That is a runtime question, not a model question. And the answer is now arriving in production, on real infrastructure, with durability guarantees and audit trails. This isn't a demo. This is the substrate.

Why "the model reads what the runtime hands it"

Think about what an agent actually does when someone asks ChatGPT, Claude, or AI Mode a real-world question — say, *"find me a UK web design consultant who can do a technical SEO audit and ship fixes in WordPress."*

The model doesn't open Chrome. The model doesn't render your homepage. The model doesn't watch your hero animation finish loading.

The runtime does. The runtime spawns a fetch. The runtime decides whether to execute your JavaScript. The runtime parses your HTML. The runtime resolves (or doesn't resolve) your structured data. The runtime negotiates authentication if there's a paywall. The runtime times out if you're slow. The runtime hands the model a compressed, cleaned, chunked representation of what it found.

By the time the language model sees anything from your site at all, it's seeing the runtime's interpretation of it. Not the page. The interpretation.

The model reads what the runtime hands it.

This is not a small distinction. It rewires the entire optimisation stack.
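
To make that hand-off concrete, here is a toy sketch of the clean-and-chunk step in Python. Everything in it is my assumption: the names `hand_off`, `chunk_words` and `max_chunks` are invented for illustration, and no vendor has published its pipeline in this form. A real runtime would presumably rank chunks by relevance rather than keep the first few. The point is only that scripts never run and anything past the budget never reaches the model.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keeps text nodes and ignores <script>/<style> bodies (no JavaScript is run)."""

    def __init__(self):
        super().__init__()
        self.skipping = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skipping = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skipping = False

    def handle_data(self, data):
        if not self.skipping:
            self.parts.append(data)


def hand_off(html, chunk_words=50, max_chunks=3):
    """Toy runtime: clean raw HTML, chunk by word count, keep the first few chunks."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    words = " ".join(parser.parts).split()
    chunks = [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]
    # Hypothetical budget: everything after max_chunks is silently discarded.
    return chunks[:max_chunks]
```

Whatever `hand_off` returns is the whole of your page as far as the model is concerned.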

The wrong question, eighteen months running

If you've been tracking which model cites you, you've been measuring an output four layers downstream of where the actual decisions get made. The runtime decided whether to fetch you. The runtime decided whether to wait for your JS. The runtime decided whether to trust your schema. The runtime decided which chunks to keep. By the time the model "chose" to cite you or not, most of the meaningful filtering had already happened.

[Figure: layered bars showing how content is filtered through runtime layers before reaching a model]

This explains a lot of weird behaviour people have been logging and assuming was model variance. Why a page cites well in one tool and not in another. Why citations swing wildly across runs. Why a beautifully optimised page with strong content gets ignored while a sparse competitor gets pulled in. A lot of that is runtime variance, not model preference. Different runtimes parse your page differently. Different runtimes time out at different thresholds. Different runtimes decide differently whether to wait for client-side hydration.

Rand Fishkin's research with Gumshoe this week — which I wrote about yesterday — quietly underlines this. Tracking exact AI rankings is mostly nonsense, he found, but tracking visibility share across many prompts and many runs is genuinely useful. Why is visibility share more stable than rankings? Partly because you're averaging out model variance. But also because you're averaging out runtime variance — fetches that timed out, parses that failed, chunks that got dropped. The signal that survives the noise is whether the runtime can reliably *read* your site at all.

That's not a model question. That's an infrastructure question.

Most of the AI visibility playbook is being written about the wrong layer of the stack.

What "runtime-legible" actually means

So if the runtime is the new evaluation layer, what does it actually evaluate?

Pillar 1

Whether your important content survives without a browser

The first test. If a runtime fetches your page with a basic user agent, does the content the model needs to see appear in the raw HTML? Or does it only render after a JavaScript framework hydrates client-side?

Most React, Vue, and Next.js sites I audit fail this test in subtle ways. The page works fine for humans. Google can render it. But a sandboxed agent runtime doing a quick fetch — under time pressure, cost-conscious, executing many threads in parallel as Pichai described — will often skip the JS execution and grab whatever is in the initial HTML response.

If your product details, pricing, key claims, or service descriptions only exist after hydration, the runtime is handing the model a half-empty page.
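
A rough way to run this test yourself, sketched in Python using only the standard library. The function names, the user-agent string and the phrases you check for are my own placeholders, not anything a particular runtime uses. It fetches the raw HTML without executing JavaScript, strips script and style bodies, and reports which of your key phrases are missing from what is left.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class VisibleText(HTMLParser):
    """Collects text that sits outside <script> and <style> tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many skipped tags we are currently inside
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the page text a no-JavaScript fetch would see, whitespace-normalised."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def missing_phrases(html: str, phrases: list[str]) -> list[str]:
    """Return the phrases that do NOT appear in the raw, unrendered HTML text."""
    text = visible_text(html).lower()
    return [p for p in phrases if p.lower() not in text]


def fetch_raw_html(url: str, timeout: float = 5.0) -> str:
    """Plain HTTP GET with a basic user agent (placeholder name); no JS execution."""
    req = Request(url, headers={"User-Agent": "raw-html-audit/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Usage would look something like `missing_phrases(fetch_raw_html("https://example.com/services"), ["technical SEO audit", "WordPress"])`, with your own URL and phrases. An empty list means those phrases survive without a browser. Anything returned only exists after hydration, or not at all.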

Pillar 2

Whether your structured data resolves cleanly

Schema isn't magic. I've been clear about that. But in a runtime context, structured data is genuinely useful — not because it "tells the AI what your page is about," but because it gives the runtime a stable, machine-readable layer it can extract without making interpretive guesses.

A runtime is making decisions in milliseconds about what to keep and what to discard. JSON-LD that validates cleanly is cheap to extract. Visual layouts that imply structure but don't declare it are expensive. The runtime takes the cheap path under load. Pedro Dias was right that schema doesn't *ensure* anything in a probabilistic system — I agreed at the time — but in a runtime-mediated system, schema raises the probability that the right facts make it past the parsing stage at all. Different argument. Cleaner one.
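
The cheap path looks roughly like the sketch below. It is a minimal illustration in Python, not any runtime's real extractor, and `extract_jsonld` is a name I made up. It pulls every JSON-LD block out of the raw HTML and tries to parse it. Note that it only checks the JSON is well-formed, not that the vocabulary is valid Schema.org.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw body of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        kind = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and kind == "application/ld+json":
            self.in_jsonld = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            self.blocks.append("".join(self.buffer))
            self.in_jsonld = False


def extract_jsonld(html):
    """Return (items, errors): parsed JSON-LD objects and parse-error messages."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    items, errors = [], []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            # A block that fails to parse is simply lost to the extractor.
            errors.append(str(exc))
            continue
        items.extend(data if isinstance(data, list) else [data])
    return items, errors
```

A block that parses costs one `json.loads` call. A block with a stray comma costs nothing either, because an extractor like this just drops it.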

Pillar 3

Whether your endpoints respond like APIs, not like brochures

This is the one most agencies are entirely unprepared for. Cloudflare's AI Search product, OpenAI's Agents SDK, Google's "agent manager" framing — they all assume an agent will be hitting URLs programmatically, often repeatedly, often in parallel.

If your "About us" page is a 2MB hero video and an animated SVG, that page is functionally invisible to anything that isn't a human with broadband. If your pricing page requires three clicks to reveal numbers, the runtime gives up.

The mental model is shifting from "design pages for humans, hope the bots cope" to "design endpoints that return useful structured information, then layer the human experience on top."

Pillar 4

Whether you're fast enough to get fetched at all

Speed has always mattered. Core Web Vitals matter. But in a runtime world, the threshold isn't "good enough for the user to not bounce." The threshold is "fast enough for the runtime to keep you in the candidate set."

Runtimes have budgets. Time budgets, token budgets, cost budgets. They're spinning up many parallel fetches to assemble an answer. The slow ones get dropped. The unparseable ones get dropped. The ones requiring authentication negotiation get dropped unless they're high-value enough to pay the cost.

Your site is competing not just on quality but on cost-of-extraction.
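
Here is what a time budget can look like in code: a hedged Python sketch, where `fetch_within_budget`, `fake_fetch` and the example hosts are all hypothetical and the budget figure is arbitrary. It starts every fetch in parallel, waits for the budget to expire, and keeps only what came back in time.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fetch_within_budget(urls, fetch, budget_seconds):
    """Run fetch(url) for all URLs in parallel; keep only results that finish in time."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(urls)))
    futures = {pool.submit(fetch, url): url for url in urls}
    done, _pending = wait(futures, timeout=budget_seconds)
    # Do not block on stragglers: the answer gets assembled without them.
    pool.shutdown(wait=False, cancel_futures=True)
    finished = {futures[f]: f for f in done}
    kept = {
        url: finished[url].result()
        for url in urls
        if url in finished and finished[url].exception() is None
    }
    dropped = [url for url in urls if url not in kept]
    return kept, dropped


def fake_fetch(url):
    """Stand-in for a real HTTP fetch: hosts with 'slow' in the name take a second."""
    time.sleep(1.0 if "slow" in url else 0.01)
    return f"<html>body of {url}</html>"
```

Calling `fetch_within_budget(["https://fast.example/", "https://slow.example/"], fake_fetch, 0.3)` keeps the fast page and drops the slow one. Nothing about the slow page's content quality enters into it.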

The honest limits

Three caveats before anyone runs off and rebuilds their stack.

First, model-layer optimisation hasn't stopped mattering. Models still have preferences. Models still trust some sources more than others. Models still pattern-match to authority signals. Runtime-legibility gets you into the candidate set; everything traditional SEO has taught us about authority, brand, and content quality still decides what happens once you're in.

Second, the runtime architecture is genuinely new. Most of what shipped in April is in beta or early production. The behaviours we're inferring about how runtimes parse, prioritise, and discard content are based on a small number of public details and a lot of reading between the lines. Some of this will turn out to be wrong. The direction is right; the specifics will move.

Third — and this is the awkward one — almost nobody has the tooling to actually measure runtime-legibility yet. You can test your own site with curl and a basic fetch to see what raw HTML returns. You can validate your schema. You can audit JS-dependence. But there's no equivalent of Search Console for "how does Cloudflare's Agents SDK see this page?" The measurement gap I keep harping on about is even wider at the runtime layer than at the model layer.

So this is a directional argument, not a paint-by-numbers playbook. The infrastructure is here. The diagnostics are not.

What this means for the work

If you're a marketing manager or business owner trying to figure out what to do with all of this, the practical implication is narrower than it sounds.

You don't need a new vendor. You don't need a "runtime optimisation platform." (Anyone selling you one in May 2026 is selling you a slide deck.) You need to do three things, in this order.

One: audit whether your most important pages — service pages, product pages, location pages, anything you'd want cited — render their core content in raw HTML, without requiring JavaScript execution. If they don't, fix that. This is not a 2026 problem. This is a problem that's been quietly killing AI discovery for two years and is about to start killing it more visibly.

Two: make sure your structured data validates cleanly and actually describes the content on the page. Not aspirationally. Literally. The runtime extracting your schema doesn't care what you wish the page said. It cares what the JSON-LD asserts and whether that matches the visible content.

Three: stop optimising for any single AI model. The model is downstream. Optimise for being legible to a generic, fast, sandboxed fetch from a runtime with a tight time budget. If you're legible to that, you're legible to all of them. If you're not, no amount of model-specific tuning will save you.
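
Step two, checking that the JSON-LD literally matches the visible content, is the least obvious of the three to test, so here is a crude Python sketch of one way to do it. The names are mine and the matching is deliberately naive: it flags any string value in the JSON-LD that does not appear verbatim (ignoring case) in the page text, and it skips URL values because those rarely appear as text. Expect false positives. Treat it as a prompt for a manual look, not a validator.

```python
import json
from html.parser import HTMLParser


class PageParts(HTMLParser):
    """Splits raw HTML into visible text and JSON-LD script bodies."""

    def __init__(self):
        super().__init__()
        self.mode = "text"   # "text", "jsonld" or "skip"
        self.text, self.jsonld = [], []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            kind = (dict(attrs).get("type") or "").strip().lower()
            self.mode = "jsonld" if kind == "application/ld+json" else "skip"
            if self.mode == "jsonld":
                self.jsonld.append("")
        elif tag == "style":
            self.mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.mode = "text"

    def handle_data(self, data):
        if self.mode == "text":
            self.text.append(data)
        elif self.mode == "jsonld":
            self.jsonld[-1] += data


def string_values(node, path=""):
    """Yield (path, value) for every string leaf, skipping @context/@type style keys."""
    if isinstance(node, dict):
        for key, value in node.items():
            if not key.startswith("@"):
                yield from string_values(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from string_values(value, f"{path}[{i}]")
    elif isinstance(node, str):
        yield path, node


def unmatched_claims(html):
    """Return (path, value) pairs asserted in JSON-LD but absent from the page text."""
    parts = PageParts()
    parts.feed(html)
    parts.close()
    visible = " ".join(" ".join(parts.text).split()).lower()
    missing = []
    for raw in parts.jsonld:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            missing.append(("<invalid JSON-LD>", raw.strip()[:60]))
            continue
        for path, value in string_values(data):
            if value.startswith(("http://", "https://")):
                continue   # URLs are rarely printed on the page
            if " ".join(value.split()).lower() not in visible:
                missing.append((path, value))
    return missing
```

If a page's schema says `"areaServed": "Manchester"` and the copy only mentions Surrey and London, `unmatched_claims` returns that field. That is exactly the aspirational-versus-literal gap described in step two.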

The deeper shift here, the one that will take the industry another year or two to fully absorb, is that the unit of optimisation has changed. We spent twenty years optimising for one machine — Googlebot — that read pages and ranked them. Then we briefly tried to optimise for several machines — the chatbots — that read pages and cited them. Now the actual reader is a layer below that: a runtime that reads pages, interprets them, hands them to a model, and orchestrates many such reads across many threads to assemble a single answer.

Your website isn't being read by an AI. Your website is being read by infrastructure that decides what an AI gets to see. Build for the infrastructure and the model takes care of itself. Build for the model and the infrastructure quietly drops you before the model even gets the chance.

The runtime is the new SEO evaluation layer. Most strategy decks haven't caught up. Yours probably can.
