Reddit is the loudest input. That makes it the cheapest target.
Reddit dominates AI search citations. The SEO industry sees opportunity. The real story is a structural vulnerability with a closing window.
Steve Huffman went on stage at Fast Company's summit this week and said the quiet part with the volume turned up. Large language models, in his words, "would not exist as we know them without Reddit." He called Reddit's data "modern oil." He cited Profound's tracking to say Reddit is the most cited platform across all major AI models.
He's not wrong. That's the problem.
If Reddit is the largest single training corpus and the most cited platform in generative answers, the SEO industry's response has been entirely predictable: write a Reddit playbook. Get into the threads. Seed the subreddits. Build karma. Get cited. There are agencies pitching this right now as a service line. There are courses being recorded. There are LinkedIn posts going up daily about "Reddit-first GEO strategy."
I want to argue something different. The dominance of Reddit as an AI training and citation source is not an opportunity. It's a structural vulnerability — for the models, for the brands relying on it for visibility, and for the discovery layer of the open web. The bigger Reddit's share of the citation pie gets, the worse the downstream consequences for everyone involved. Including Reddit.
What Huffman actually told us
Strip out the "modern oil" framing and what's left is a remarkable admission. The CEO of one of the biggest sources of LLM training data is publicly stating that the models are "regurgitating on an absolutely massive scale what they've consumed elsewhere" and that "a large portion of that consumption is actually just the human conversation on Reddit."
Read that twice. He's not saying Reddit is one input among many. He's saying it's a structurally load-bearing one. He's also saying — though he doesn't put it this way — that AI search results are being weighted toward a specific kind of source: anonymous, conversational, opinionated, often unverified, frequently wrong, and heavily skewed toward the demographics that post on Reddit.
This is the part the industry isn't pricing in. Reddit isn't a representative slice of human knowledge. It's a representative slice of people who post on Reddit. The two are not the same, and the gap between them is the gap between what AI search thinks it knows and what is actually true about the world.
The citation-share problem nobody's modelling
When Profound says Reddit is the most cited platform across all models, the natural reaction in SEO is to ask how to get cited there. That's the wrong question. The right question is what happens to the model when one source dominates citation share to this degree.
There's a concept in information retrieval called source diversity. The idea is that a system is more reliable when it draws on many independent sources than when it draws disproportionately on one. Single-source dominance creates two failure modes: bias propagation (whatever the dominant source gets wrong, the system gets wrong) and adversarial vulnerability (whoever can manipulate the dominant source can manipulate the system).
Reddit dominance is the adversarial vulnerability of AI search.
This isn't theoretical. Reddit has been a target of organised manipulation for as long as the platform has existed: vote brigading, sockpuppet rings, marketing astroturfing, state-sponsored influence ops. None of that went away when AI started citing it. If anything, the incentive to manipulate Reddit has risen sharply, because a successful seed no longer affects just one thread; it can affect what ChatGPT tells millions of people about your category.
The agencies pitching "Reddit-first GEO" are essentially advertising that they've noticed this attack surface. The honest version of the pitch is: *we will help you manipulate the dominant training and citation source for AI search, before everyone else figures out you can.* I'm not being cute about this. That's the actual offer, and it works until enough people are doing it that the signal gets noisy, the platform clamps down, the models adjust, or some combination of all three.
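The source-diversity point above can be made concrete with a standard concentration measure. The sketch below is illustrative only: the domain names and counts are hypothetical, not Profound's data, and the Herfindahl-Hirschman index is one reasonable choice of metric rather than anything the citation trackers are known to report.

```python
from collections import Counter


def citation_shares(cited_domains):
    """Convert a list of cited domains into each domain's share of all citations."""
    counts = Counter(cited_domains)
    total = sum(counts.values())
    return {domain: n / total for domain, n in counts.items()}


def herfindahl(shares):
    """Sum of squared shares: 1/n for n equally cited sources, 1.0 for a single source."""
    return sum(s * s for s in shares.values())


def effective_sources(shares):
    """Inverse of the index: how many equally weighted sources give the same concentration."""
    return 1.0 / herfindahl(shares)


# Hypothetical sample: ten citations pulled from AI answers in one category.
sample = (
    ["reddit.com"] * 6
    + ["example-trade-body.org"] * 2
    + ["example-news.co.uk", "example-vendor.com"]
)
shares = citation_shares(sample)

print("Dominant source share:", shares["reddit.com"])
print("Concentration index:", round(herfindahl(shares), 2))
print("Effective number of sources:", round(effective_sources(shares), 2))
```

In this made-up sample four domains are cited, but the concentration is equivalent to fewer than two and a half equally weighted sources. That gap between the nominal and effective source count is the exposure: the closer the effective number gets to one, the more a single platform's errors and manipulations pass straight through.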
Why Huffman is happy to say this out loud
It's worth asking why a CEO would publicly position his platform as the indispensable ingredient in a multi-trillion-dollar industry. The framing isn't accidental. It's negotiating posture.

Reddit has two licensing deals — Google and OpenAI, both from 2024. It has lawsuits against Anthropic and Perplexity. Huffman's "modern oil" line is aimed squarely at the next round of negotiations: with Anthropic, Meta, xAI, and whoever else needs the corpus to compete. Every time he says it on stage, he raises the price of the next deal.
The licensing strategy is rational. The implications for the open web are not.
What Reddit is doing — and Stack Overflow, and increasingly X — is converting open-web content into a private licensing asset. Content that users contributed under one set of expectations (a public forum) is now being sold under a different one (a closed training pipe). The users who wrote the content don't see the cheque. The publishers competing for AI citation share don't get a comparable deal. The open web that fed the early models is being progressively enclosed by the platforms that hosted it.
That's the loop. And we built it.
The brand-visibility consequence
Here's where this lands for anyone actually doing marketing.
If you're a UK service business — let's say a chartered surveyor, or a B2B SaaS, or an e-commerce brand — and you want to be cited in AI Mode or ChatGPT answers for queries in your category, the data says Reddit is one of the highest-leverage surfaces. That's true. The agencies aren't lying about that part.
What they're not telling you is the half-life of the tactic.
A Reddit-driven citation strategy works under three conditions. One: the model's training cutoff includes the recent Reddit content where your brand is mentioned. Two: the platform's anti-manipulation defences don't flag and remove the seeded content. Three: your category competitors haven't yet saturated the same threads, so your brand still gets disproportionate citation weight.
All three conditions are degrading. Training refreshes are getting more frequent, but they're also more selective: recent Reddit content with engagement patterns that look manipulated gets weighted down or excluded. Reddit's own moderation, after years of being accused of letting astroturfing run wild, is tightening. And the GEO consultancy market has scaled the tactic so aggressively that the median brand in a competitive category is now seeing the same recommendations to "go to r/marketing" or "find threads in r/smallbusiness" as its competitors.
What you're paying for, in most cases, is a tactic with a 12 to 24 month window before it stops working. That's not nothing. But it's also not a strategy. It's an arbitrage trade dressed up as marketing.
The verification gap is getting worse
There's a second-order problem nobody in the SEO industry is talking about, and it's the thing that turns this from a tactical question into a structural one.
Reddit is increasingly being posted on by AI. Not in the obvious ways — not bots writing top-level comments, though that happens too — but in the subtle ways. People drafting their posts in ChatGPT and pasting them. Marketing teams running comment campaigns through LLMs. Researchers using AI to generate plausible-looking discussion threads for whatever reason researchers do that. The signal-to-noise ratio of human conversation on Reddit is degrading at exactly the moment that AI models are treating it as the gold-standard source of human conversation.
There's an arXiv paper out this month flagging the broader version of this problem: generative search engines citing AI-generated sources without detection. The Reddit case is the most acute instance. Models trained on Reddit treat Reddit as ground-truth human knowledge. Reddit is increasingly populated with AI-generated content. The models retrain on that content. The loop closes.
This isn't a far-future scenario. This is happening now, and it's the reason the Huffman "modern oil" framing should make anyone in the SEO industry nervous rather than excited. The most cited source in AI search is also the source with the lowest cost of manipulation, the weakest verification floor, and the fastest-growing share of AI-generated content masquerading as human discussion.
I wrote a few weeks ago about how the machine-readability problem is the SEO problem now. The Reddit citation issue is its mirror image: the verification problem is the discovery problem now. And the industry doesn't have a vocabulary for it yet.
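The arithmetic behind that loop is simple enough to sketch. The toy model below tracks only the composition of the corpus, not model behaviour, and every number in it is an illustrative assumption rather than a measurement of Reddit: the function name, the growth rate, and the AI share of new posts are all hypothetical.

```python
def ai_share_over_time(initial_ai_share, new_post_ai_share, growth_per_period, periods):
    """Track the AI-generated share of a corpus as new posts are appended.

    Each period the corpus grows by `growth_per_period` (0.10 means 10% more posts),
    and `new_post_ai_share` of those new posts are AI-generated.
    """
    share = initial_ai_share
    history = [share]
    for _ in range(periods):
        # Existing AI posts plus new AI posts, divided by the enlarged corpus.
        share = (share + growth_per_period * new_post_ai_share) / (1 + growth_per_period)
        history.append(share)
    return history


# Illustrative assumptions only: corpus starts fully human, grows 20% a period,
# and 30% of each period's new posts are AI-generated.
history = ai_share_over_time(
    initial_ai_share=0.0,
    new_post_ai_share=0.3,
    growth_per_period=0.2,
    periods=10,
)
for period, share in enumerate(history):
    print(f"period {period}: AI share {share:.3f}")
```

Under these assumptions the corpus-wide share only ever moves toward the AI share of new posts and never falls back. The practical point is that the historical archive dilutes the problem for a while, which makes it easy to miss, while anything that weights recent content more heavily sees the higher figure sooner.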
The honest limits of this argument
A few things this piece doesn't cover, and where reasonable people could push back.
I'm not arguing Reddit citation is worthless. For some categories — especially consumer recommendations, troubleshooting queries, and category research — Reddit citation share is genuinely important and will be for years. If you're a brand in those categories and you're not paying attention to how you appear in relevant threads, you're leaving real visibility on the table.
I'm also not arguing the models are about to collapse under their own training data. The major labs are getting better at filtering, weighting, and detecting AI-generated content in their corpora. The verification gap is widening, but the labs aren't standing still either. The likely outcome is a slow degradation of source quality, not a sudden one.
And I'm not arguing Reddit-focused agencies are scams. Some of them are doing genuinely careful work — earning citations through actually useful contributions, building brand presence in legitimate communities, treating Reddit like the editorial surface it actually is. The good ones are indistinguishable from competent PR. The bad ones are the ones selling "seed 50 threads a month" packages.
The argument is narrower than the headline suggests: Reddit's structural position in AI search is unstable, the tactical opportunity has a shorter half-life than the industry is pricing in, and the second-order consequences for content authority and verification are being ignored almost entirely.
What this means for how you spend the next year
If you're running a brand and trying to decide where to allocate budget for AI search visibility, Reddit is one input. Treat it like one. Not the input.
The brands that come out of the next two years in a stronger position are the ones building defensible discovery assets that don't depend on a single platform's continued dominance or a single tactic's continued effectiveness. That means earned media in publications the models trust. It means structured data and clean technical foundations that survive across retrieval architectures. It means brand recognition strong enough that the models cite you regardless of which corpus they were trained on this quarter.
The brands that come out worse are the ones being sold "Reddit-first" as a complete strategy, paying for tactics that will work for 18 months and then leave them with nothing.
Huffman's framing is good for Reddit. It's good for the licensing deals Reddit is about to sign. It is not, in any meaningful sense, good news for the rest of us. The most cited platform in AI search being a platform with Reddit's structural properties — anonymous, gameable, increasingly AI-populated, demographically skewed — is a problem the industry is choosing not to look at because the short-term arbitrage is too tempting.
The arbitrage will close. The structural problem will stay. The brands that build for the second condition, not the first, will be the ones still visible when the window shuts.