The judgment layer is the only part of AI that pays you
Most senior practitioners use AI for drafts and summaries. The actual leverage is in four modes nobody has built a workflow around.
Most of the AI conversation in marketing right now is about output. How many drafts, how many briefs, how many variants, how many summaries per hour. The productivity case is real and I'm not going to pretend otherwise. But the productivity case is also where the value tops out, and the industry is about to discover that the hard way.
A paper presented at the 2025 ASIS&T Annual Meeting by Tim Gorichanaz at Drexel analysed 205 real-world ChatGPT use cases and sorted them into six modes: Writing, Deciding, Identifying, Ideating, Talking, and Critiquing. Duane Forrester wrote about it in Search Engine Journal today, and the framing is the most useful thing I've read about AI in months. Writing alone accounts for 47% of how people use these tools. Identifying — explain this, summarise that — another 10%. Together that's the overwhelming majority of how senior practitioners are using AI day to day.
The other four modes barely register. Deciding sits at 21% of the sample, Ideating at 9%, Talking at 8%, Critiquing at 6%. Those four are where AI does work that compounds. The two that dominate are the work that doesn't.
This isn't a tooling problem. It's a habit problem. And the gap between practitioners who close it and practitioners who don't is going to be the defining career divide of the next eighteen months.
What execution-layer AI actually buys you
When you use AI to draft a brief, write a meta description, generate a content outline, summarise a long article, or translate a paragraph, you are using it as an execution accelerator. The output gets done faster. Sometimes it gets done better. Almost always it gets done cheaper.
The execution layer is being commoditised in real time. Every agency is faster than they were last year. The floor is rising. The ceiling is the same.
That's genuine value. I'm not dismissing it.
But it's also a treadmill. The competitive advantage of "I can produce three briefs in the time you produce one" lasts exactly as long as it takes the person across from you to install the same tools. Which, in this market, is about a fortnight. The execution layer is being commoditised in real time. Every agency is faster than they were last year. Every in-house team is faster. The floor is rising. The ceiling is the same.
McKinsey's 2025 State of AI survey confirms the pattern at the enterprise level. The most commonly reported use cases are content drafting and information capture. 63% of organisations using generative AI apply it primarily to create text. The entire industry has converged on the same mode. Which means the entire industry is competing on the same axis, which means nobody is winning on it for very long.
The productivity gains are real. The strategic gains are not. And the gap between those two things is where the careers get made.
The execution layer is being commoditised in real time. Every agency is faster than they were last year. The floor is rising. The ceiling is the same.
What "judgment layer" actually means
Forrester's piece uses "judgment layer" to talk about the four underused modes — Deciding, Ideating, Talking, Critiquing — and I want to push on that frame a bit because the labels matter.
Deciding is the one most people misunderstand. It doesn't mean "let the AI decide." It means using AI as a structured pressure-test on a decision you're about to make. Here is the context: competitive landscape, current visibility posture, budget constraints, historical performance, three options on the table. Tell me what I'm missing. Tell me which assumption looks weakest. Tell me what a sceptical board member would ask. That is a fundamentally different prompt than "write me a section about X," and it produces fundamentally different value.
Critiquing is the one I find most underused in my own work. Most senior practitioners are excellent at producing arguments and mediocre at stress-testing them. AI is genuinely good at being a hostile reviewer if you ask it to be one. Take a strategy I've just written. Tell me where the logic breaks. Tell me what I'm taking on faith. Tell me what the client's procurement team will object to. That's not draft work. That's judgment work.
Ideating, properly used, is divergent thinking on tap. Not "give me ten blog post ideas" — that's still execution. More like: here's a positioning problem I can't solve, give me fifteen analogies from other industries that might suggest a way through. Most of them will be useless. One or two will unlock something.
Talking is the weird one. Gorichanaz puts it at 8% and it's the mode practitioners are most embarrassed to use, which is a shame, because thinking out loud at an interlocutor is one of the genuinely useful things these systems do. Working through a tangled client situation by typing at Claude for fifteen minutes is often more productive than the same fifteen minutes spent staring at a notebook.
None of these produce immediate, measurable output. None of them generate a deliverable. None of them look like "AI work" in the way a draft does. Which is precisely why nobody builds workflows around them.
Why the industry defaults to execution
The honest reason most practitioners are stuck in Writing and Identifying mode isn't that they don't know better. It's that the pressure to show output is relentless and judgment work doesn't show.

If I spend an hour using Claude to draft a competitive analysis, I have something to send the client at the end of the hour. If I spend an hour using Claude to pressure-test the strategic frame underlying that competitive analysis, I have… a clearer head. Maybe a better question. Possibly the realisation that the entire engagement is pointed at the wrong problem. None of which fits neatly in a status report.
This is how organisations end up with very efficient teams producing very confident work in the wrong direction. The execution treadmill rewards velocity. The judgment treadmill rewards correctness. Velocity gets measured weekly. Correctness gets measured in six months. Guess which one performance reviews are designed around.
There's also a genuine skill issue. Writing-mode prompts are easy. "Draft me an outline for a piece on X" requires almost no setup. Deciding-mode prompts require you to articulate the actual decision, lay out the constraints, supply the context, and frame the question well enough that the output is useful. That's harder than people expect. The first time you try it, the output is usually generic, because the input was generic. The temptation is to conclude AI isn't useful for this kind of work, when actually you just haven't built the muscle yet.
The connection to AI search nobody is making
Here's where this gets interesting for anyone working on AI search visibility.
The entire "GEO" discourse right now is execution-layer. Optimise this schema. Restructure those passages. Add an llms.txt. Get cited on Reddit. These are all real tactics and some of them work. But the harder questions — which queries actually matter for our business, whether a retrieval problem is a content problem or a brand problem, how to allocate effort across SEO and GEO when both need attention and the budget doesn't stretch — those are judgment-layer questions, and almost nobody is using AI to work through them.
The irony is rich. Senior practitioners are using AI to produce the GEO content that other AI systems will summarise, while not using AI to think through whether the GEO strategy itself is correct. The execution gets faster. The strategy stays exactly as good as it was before the tools arrived. Which, if the strategy was wrong, means we're now wrong faster.
I've seen this pattern in my own work. The clients getting the most out of AI right now aren't the ones with the slickest content workflows. They're the ones whose marketing leads have started using Claude or ChatGPT as a thinking partner on questions like "should we be optimising for AI Overviews citation or for direct ChatGPT traffic, given our buyer journey looks like this?" That's judgment work. It doesn't produce a deliverable. It produces a better deliverable from everything downstream.
The honest limits
A few things this argument doesn't cover.
Judgment-mode AI is only as good as the context you supply, and most practitioners are bad at supplying context. If your prompt is "should we prioritise SEO or GEO this quarter," the output will be a generic listicle. If your prompt includes the actual revenue split, the actual competitive position, the actual budget, the actual buyer journey, and the actual constraints, the output starts to be useful. Most people don't have that information organised well enough to paste into a chat window. That's a real barrier and pretending it isn't won't help anyone.
There's also a model-quality consideration. Deciding and Critiquing modes work much better on Claude Opus or GPT-5 than on cheaper, faster models. If your workflow is built around whatever Gemini Flash ships in your suite, the judgment-layer use cases will feel disappointing. The mode shift requires a tool shift, and the tool shift costs more.
And there's a real risk of over-reliance. Using AI to pressure-test a decision is useful. Using AI to make the decision for you is dangerous, because the model has no skin in the outcome, no accountability for being wrong, and no ability to feel the consequences. The judgment layer is where you do harder thinking with AI as an interlocutor, not where you outsource judgment to the machine. Forrester's framing is good on this. Worth reading the original.
What this means if you're a senior practitioner
Stop measuring your AI usage by how many drafts it produces. Start measuring it by how many decisions it sharpened.
The execution-layer use cases will continue to matter. Drafts still need writing, summaries still need producing, briefs still need shaping. But the competitive value of being faster at those tasks is approaching zero, because everyone is getting faster at them at the same rate. The competitive value of being better at the decisions underneath them is going up, because almost nobody is working on it.
If you're running an in-house team or an agency, the question to ask is not "are we using AI enough." Everyone is using AI enough. The question is which modes you're using it in, and whether the four that compound have any structured place in your workflow at all. Mine the gap. There's a lot of it.
The treadmill rewards speed. The career rewards judgment. They are not the same thing, and the next two years are going to make the difference brutally obvious.
Ready to improve your visibility in AI search?
If you're an SME in Surrey or London and you want more qualified leads from search — including the growing AI answer layer — let's talk.
Book a discovery call