Google-Agent doesn’t read your robots.txt. That’s the small problem.

Google-Agent bypasses robots.txt — but the real story is what agents do to attribution, conversion data, and the single-bucket model of 'Google traffic'.

On March 20, Google added a new fetcher to its official list. Not a crawler. Not a training bot. An agent — the thing that visits your website when somebody asks Gemini to research a product or fill out a form on their behalf. The user agent string is `Google-Agent`. The first product using it is Project Mariner.

The line in Google's documentation that's getting all the attention is this: Google-Agent, as a user-triggered fetcher, generally ignores robots.txt. The logic is that the agent acts as a human's proxy, and humans don't consult robots.txt before clicking a link.

Most of the coverage has stopped there, framed it as a robots.txt loophole, and moved on. That's the small problem. The bigger one is what this category of visitor does to the assumptions baked into every SEO and analytics stack built over the last fifteen years.

We've been treating Google as one visitor for a decade. It isn't.

For most of the modern web's life, "Google traffic" meant one thing: a person clicked a blue link in a SERP and arrived on your page. The crawler that found the page (Googlebot) and the human that visited it were cleanly separated. Logs showed Googlebot. Analytics showed humans. The boundary was clean.

That model has been quietly eroding for two years. AI Overviews started intercepting clicks. AI Mode started answering queries without sending traffic at all. Then ChatGPT-User and Claude-User started showing up in logs — agents fetching pages in real time on behalf of someone in a chat window, then summarising them into an answer that never required the user to visit your site.

Google-Agent is a different beast again. It's not crawling for the index. It's not fetching to summarise into an answer. It's *acting*. It's reading a product page so it can recommend a purchase. It's filling out a contact form. It's comparing two services and clicking "book." It's doing the thing the human would have done, except the human never sees your page.

So your "Google traffic" is now at least four separate things:

| Visitor | What it does | Shows up where |
| --- | --- | --- |
| Googlebot | Indexes pages for Search | Logs only |
| Human via SERP | Clicks, reads, converts | Analytics + logs |
| AI Overviews / AI Mode | Summarises your content into an answer | Logs (sometimes), no analytics |
| Google-Agent | Acts on the user's behalf, may submit forms | Logs only, distorts conversion data |

These four visitors have different intentions, different conversion implications, and different value to your business. Most analytics setups still treat them as a single bucket called "organic" or "google / organic." That was a defensible simplification in 2018. It's actively misleading in 2026.

The form submission problem nobody is talking about

Here's the bit that should be making CRO and lead-gen people nervous, and isn't yet.

Google-Agent can fill out forms. Project Mariner's whole pitch is task completion, including form completion. So when Google-Agent visits your "request a quote" page on behalf of someone researching agencies and submits the form with the user's details, what shows up in your CRM?

A lead. With a real email address. With no context about how it got there.

The agent acts. The attribution doesn't.

Your sales team rings the lead. The lead has no memory of submitting anything because they didn't — their AI assistant did, possibly across twelve other sites at the same time. They were "comparing options," and the agent's idea of comparing options included submitting forms on all of them.

Your conversion rate looks better. Your lead quality looks worse. Your sales cycle gets weird. And your analytics — which still attribute the lead to "google / organic" because the agent technically arrived from a Google property — give you no signal that anything has changed.

This is the same measurement gap Lily Ray has been writing about with AI content, but expressed at the conversion layer instead of the visibility layer. The numbers look fine until you look at what they actually represent, and then nothing makes sense.

Robots.txt was never the right tool, and now it isn't even a tool

The robots.txt conversation is a distraction. Robots.txt has never been an access control mechanism — it's a polite request, voluntarily honoured. If you needed to actually stop a visitor reaching a page, you used authentication, IP blocks, rate limiting, or a paywall.
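To make "voluntarily honoured" concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` name and the example.com URLs are made up for illustration. The point is where the check lives: inside the client, which has to choose to run it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /quotes/",
]

rules = RobotFileParser()
rules.parse(ROBOTS_LINES)


def polite_client_may_fetch(user_agent: str, url: str) -> bool:
    """What a well-behaved client asks itself before making a request.

    Nothing on the server runs this check. A client that never calls it
    is never stopped by robots.txt.
    """
    return rules.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(polite_client_may_fetch("ExampleBot", "https://example.com/quotes/new"))
    print(polite_client_may_fetch("ExampleBot", "https://example.com/about"))
```

A fetcher that skips the call simply requests the page, which is why anything you genuinely need to protect has to be enforced server-side.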

What's changed is that the polite-request model assumed two categories: humans (who don't read robots.txt) and crawlers (who do). Google-Agent sits in the middle. It's machine traffic acting for a human, and Google has decided the human's intention is what counts.

You can argue with the principle. OpenAI and Anthropic made the opposite call: ChatGPT-User and Claude-User respect robots.txt, even though the same acting-for-a-human argument would justify ignoring it. Google picked the more aggressive interpretation. Fine. The practical answer is the same either way: if you need access control, robots.txt was never going to provide it. Build proper controls or accept the traffic.

What's worth more attention is the cryptographic identity piece — Google's experimenting with Web Bot Auth, an IETF draft where each agent signs requests with a private key and websites can cryptographically verify the visitor's identity. Cloudflare, Akamai and Amazon already support it. This actually matters, because user agent strings can be spoofed by a teenager with five minutes and a Python script. Signatures can't.

Once Web Bot Auth becomes standard, you'll be able to write firewall and CDN rules that say "allow verified Google-Agent, block unverified traffic claiming to be Google-Agent." That's a real access policy. Robots.txt was always theatre.

The real story isn't that robots.txt got bypassed. It's that agents are about to have cryptographic identities, and the next generation of access policy will be written against those.
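As a rough sketch of the "allow verified, block unverified" rule described above, the decision logic might look like the Python below. It does not implement Web Bot Auth. It assumes some upstream layer (a CDN or reverse proxy) has already checked the signature and handed you a boolean. The function name, the `signature_verified` parameter and the three return values are all hypothetical.

```python
def access_decision(user_agent: str, signature_verified: bool) -> str:
    """Return 'allow', 'block', or 'default' for one request.

    Assumption: signature_verified is supplied by an upstream component
    that performed the cryptographic check. This function only encodes
    the policy, not the verification.
    """
    claims_google_agent = "google-agent" in user_agent.lower()

    if claims_google_agent and signature_verified:
        # The visitor claims to be Google-Agent and has proved it.
        return "allow"
    if claims_google_agent:
        # The visitor claims to be Google-Agent but cannot prove it,
        # so treat it as spoofed.
        return "block"
    # Everything else falls through to whatever rules you already have.
    return "default"
```

The design choice worth noting is that the user agent string only selects which rule applies. The allow decision rests on the verified signature.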

What this means for how you instrument your site

The work to do here isn't strategic. It's plumbing. But it's plumbing nobody's doing yet, which means whoever does it first has clean data while everyone else flies blind.

Three things.

First, segment your logs properly. Stop treating all bot traffic as a single noise floor to filter out. Build dashboards that separately track Googlebot, GPTBot, ClaudeBot (the crawlers), then ChatGPT-User, Claude-User, Google-Agent (the agents), then everything else. The volume of each tells you something different about your visibility profile, and the trend lines are diverging fast.
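A minimal sketch of that segmentation, assuming you already have the raw User-Agent header for each log line. The token names are the ones used in this article. Matching them as case-insensitive substrings is an assumption about how the strings appear in practice, and the function and dictionary names are hypothetical.

```python
from collections import Counter

# Order matters: agents are checked before crawlers.
# Tokens come from this article; substring matching is an assumption.
TRAFFIC_CLASSES = {
    "agent": ("ChatGPT-User", "Claude-User", "Google-Agent"),
    "crawler": ("Googlebot", "GPTBot", "ClaudeBot"),
}


def classify_user_agent(user_agent: str) -> tuple[str, str]:
    """Return (traffic_class, matched_token) for a raw User-Agent header."""
    ua = user_agent.lower()
    for traffic_class, tokens in TRAFFIC_CLASSES.items():
        for token in tokens:
            if token.lower() in ua:
                return traffic_class, token
    return "other", "other"


def segment_counts(user_agents) -> Counter:
    """Count requests per (traffic_class, matched_token) pair."""
    return Counter(classify_user_agent(ua) for ua in user_agents)
```

Feed `segment_counts` one day of User-Agent values at a time and you have the per-visitor trend lines for a dashboard. Because user agent strings can be spoofed, treat these counts as claimed identity, not verified identity.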

Second, instrument your forms and conversion endpoints to detect agent submissions. This is harder than it sounds because agents will increasingly look like browsers — they'll have JavaScript, accept cookies, behave plausibly. But honest user agent strings (which Google publishes) plus behavioural signals (instant form completion, no mouse movement, no scroll) plus the Web Bot Auth signatures once they're widespread give you a way to flag agent-submitted leads in your CRM. Even a simple "agent-submitted" tag changes how your sales team handles the contact.
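One way to turn those signals into a CRM tag, sketched in Python. Everything here is illustrative: the `FormSubmission` fields assume your front end already records time-to-complete and interaction counts, the three-second threshold is a guess, not a benchmark, and `signature_verified` is a placeholder for a Web Bot Auth check you may not have yet.

```python
from dataclasses import dataclass

# Tokens from this article; substring matching is an assumption.
DECLARED_AGENT_TOKENS = ("google-agent", "chatgpt-user", "claude-user")
BEHAVIOURAL_SIGNALS = {"instant_completion", "no_mouse_movement", "no_scroll"}


@dataclass
class FormSubmission:
    user_agent: str
    seconds_to_complete: float  # form render to submit, measured client-side
    mouse_events: int
    scroll_events: int
    signature_verified: bool = False  # placeholder for an upstream check


def agent_signals(sub: FormSubmission, min_human_seconds: float = 3.0) -> list[str]:
    """List every agent-like signal present on a submission."""
    signals = []
    ua = sub.user_agent.lower()
    if any(token in ua for token in DECLARED_AGENT_TOKENS):
        signals.append("declared_agent_user_agent")
    if sub.signature_verified:
        signals.append("verified_agent_signature")
    if sub.seconds_to_complete < min_human_seconds:
        signals.append("instant_completion")
    if sub.mouse_events == 0:
        signals.append("no_mouse_movement")
    if sub.scroll_events == 0:
        signals.append("no_scroll")
    return signals


def crm_tag(sub: FormSubmission) -> str:
    """Map signals to 'agent-submitted', 'possible-agent', or 'human'."""
    signals = agent_signals(sub)
    if "declared_agent_user_agent" in signals or "verified_agent_signature" in signals:
        return "agent-submitted"
    # Behavioural signals alone are weak, so require at least two of them.
    if len(BEHAVIOURAL_SIGNALS.intersection(signals)) >= 2:
        return "possible-agent"
    return "human"
```

The two-tier output is deliberate. A declared or verified agent gets a firm tag, while behaviour alone only earns "possible-agent", because touch-screen users and browser autofill can also produce fast, mouse-free submissions.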

Third, segment your analytics by inferred traffic class. GA4 quietly started tracking some AI assistant traffic this year. It's nowhere near comprehensive, but the direction is right. The metric you actually care about is "humans who arrived because of something I did that an AI surfaced" — and you can't get there from a single "organic" bucket. You need at minimum a split between direct human SERP traffic, AI-referred human traffic, and agent traffic. Anything coarser than that is going to mislead your decisions in 2026.
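The three-way split can be approximated from two fields most setups already capture, the user agent and the referrer. The sketch below is a starting point under stated assumptions: the host lists are illustrative guesses, not a maintained registry, and not every AI assistant passes a referrer at all.

```python
from urllib.parse import urlparse

# Tokens from this article; substring matching is an assumption.
AGENT_TOKENS = ("google-agent", "chatgpt-user", "claude-user")

# Illustrative host lists. Check your own referrer data before relying on them.
AI_ASSISTANT_HOSTS = {"chatgpt.com", "gemini.google.com", "claude.ai"}
SEARCH_HOSTS = {"google.com", "www.google.com"}


def traffic_class(user_agent: str, referrer: str) -> str:
    """Return 'agent', 'ai_referred_human', 'serp_human', or 'other'."""
    ua = user_agent.lower()
    if any(token in ua for token in AGENT_TOKENS):
        return "agent"

    # urlparse returns hostname=None for empty or malformed referrers.
    host = (urlparse(referrer or "").hostname or "").lower()
    if host in AI_ASSISTANT_HOSTS:
        return "ai_referred_human"
    if host in SEARCH_HOSTS:
        return "serp_human"
    return "other"
```

A click out of an AI answer on a Google property may carry the same referrer host as a classic SERP click, so treat the "ai_referred_human" count as a lower bound.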

None of this is exciting work. It's also not the work most agencies are pitching, because the pitch right now is all about AI content production and citation tracking. The unglamorous middle — knowing what's actually happening on your own server — is the bit that's getting skipped.

What this doesn't change

Worth being honest about the limits here. Agent traffic is, today, a rounding error for most sites. Project Mariner is experimental. ChatGPT's agent mode is rolling out slowly. Claude's computer use is largely a developer toy. If you're running a regional plumber's site and trying to decide what to do this quarter, segmenting your logs for Google-Agent is not the highest priority.

What's worth doing now is the diagnostic plumbing — knowing what traffic you're getting and being able to spot when the mix changes — rather than building bespoke agent strategies for traffic that hasn't materialised yet.

The wider point isn't that you need an agent strategy this quarter. It's that the single-bucket model of "Google traffic" is going to break down faster than the SEO industry is ready for, because the industry's measurement frameworks were built for a web where every Google fetcher had roughly the same purpose. That assumption stops holding the moment agents become a meaningful percentage of fetches, and Google adding a dedicated agent identity is the signal that they expect it to be soon.

There's a recurring pattern in this industry where the conversation about a new Google announcement collapses into the most surface-level interpretation available. Robots.txt got bypassed. That's the headline. The actual story — that the categorical separation between crawlers and human visitors has collapsed, that conversion attribution is about to get genuinely weird, that cryptographic agent identity is being quietly standardised — is harder to write and harder to sell.

But the businesses that come out of the next two years ahead are the ones whose plumbing is right. Not the ones with the best content. Not the ones with the most schema. The ones who can look at their analytics, separate the four kinds of "Google traffic" they're now getting, and make decisions based on what's actually happening.

That's the loop. And almost nobody is closing it yet.
