Tony Wright • January 28, 2026

The Chunking Myth: Why Mike King's RAG Optimization Theory May Not Survive Contact with Reality

Mike King dropped a 4,000-word manifesto today about why Danny Sullivan is wrong about chunking. His core argument: structure your content into "atomic passages" because that's how RAG systems work. He even built a tool called BubbaChunk to prove it.

I respect King's work. He's sharp, and iPullRank does solid research. But after 25+ years in this business, I've learned to be skeptical when someone selling a chunking tool tells me chunking is essential. Let me walk through the numbers that complicate his thesis.

1. The Lab vs. Production Gap

King demonstrates that splitting a paragraph covering both machine learning AND data privacy into two single-topic passages improved cosine similarity scores by 19%. Sounds compelling until you realize that's a single isolated metric in a controlled environment.
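For anyone who wants to kick the tires on that kind of test, it takes a dozen lines. Here's a minimal sketch using the open-source sentence-transformers library; the model, query, and passages are my own illustrations, not King's actual setup, so expect different numbers:

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# Small open embedding model; King's tooling presumably uses something different.
model = SentenceTransformer("all-MiniLM-L6-v2")

# A topic-specific query, a mixed-topic paragraph, and the same content split in two.
query = "What does GDPR say about collecting personal data?"
mixed = (
    "Machine learning models improve as you feed them more training data. "
    "Meanwhile, data privacy laws like GDPR restrict how personal data may be collected."
)
split = [
    "Machine learning models improve as you feed them more training data.",
    "Data privacy laws like GDPR restrict how personal data may be collected.",
]

q = model.encode(query, convert_to_tensor=True)

def sim(text: str) -> float:
    """Cosine similarity between the query and a candidate passage."""
    return util.cos_sim(q, model.encode(text, convert_to_tensor=True)).item()

print(f"mixed paragraph: {sim(mixed):.3f}")
for passage in split:
    print(f"split passage  : {sim(passage):.3f}  ({passage[:40]}...)")
```

Run it and the privacy-focused split passage should outscore the mixed paragraph for the privacy query, which is King's point in miniature. Whether that gap survives a production retrieval pipeline is the open question.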

Here's what the research actually shows about RAG systems in production: a 2024 study found that RAG-powered healthcare tools reduced diagnostic errors by only 15% compared to traditional AI systems. That's meaningful but not transformative. More importantly, a January 2025 study titled "Enhancing Retrieval-Augmented Generation: A Study of Best Practices" found that the influence of various RAG components and configurations "remains underexplored." Translation: we don't actually know what moves the needle in production environments.

Researchers published over 1,200 RAG-related papers on arXiv in 2024 alone, compared with fewer than 100 the previous year. If the solution were as simple as "chunk your content," would we need this explosion of research trying to figure out what actually works?

2. The "Lost in the Middle" Problem Undermines His Core Argument

King argues that structured, chunked content survives the retrieval process better. But the landmark Stanford/Berkeley "Lost in the Middle" research (Liu et al., 2024) demonstrated something that should give every chunking enthusiast pause.

LLMs exhibit a U-shaped performance curve when processing long contexts. Models achieve their highest accuracy when relevant information appears at the beginning or end of the input, and performance degrades significantly when it sits in the middle. This degradation occurs even in models explicitly designed for long-context processing. The paper noted that Claude-1.3 and Claude-1.3 (100K) showed near-perfect accuracy on a synthetic key-value retrieval task, while other models struggled dramatically when required to retrieve from the middle of their input context.

What does this mean for chunking? Even if your perfectly structured content gets retrieved, where it lands in the context window matters more than how it was structured. You can optimize chunks all day, but if the reranking system drops your content in the middle of a 30-document context, the model may not effectively use it regardless of its atomic beauty.
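You don't have to take the paper's word for it; the experiment is cheap to approximate on your own stack. A minimal sketch, assuming an OpenAI-compatible API (the model name, the synthetic "launch code" fact, and the filler documents are all my assumptions, not Liu et al.'s actual harness):

```python
# pip install openai  (assumes OPENAI_API_KEY is set in the environment)
from openai import OpenAI

client = OpenAI()

FACT = "Document: The Zephyr-9 launch code is 431-TANGO."
FILLER = [f"Document: routine note #{i} about an unrelated topic." for i in range(29)]
QUESTION = "What is the Zephyr-9 launch code? Reply with the code only."

def accuracy_at(pos: int, trials: int = 5) -> float:
    """Place the one relevant document at index `pos` of 30 and measure recall."""
    docs = FILLER[:pos] + [FACT] + FILLER[pos:]
    prompt = "\n".join(docs) + "\n\n" + QUESTION
    hits = 0
    for _ in range(trials):
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption: any chat model works here
            messages=[{"role": "user", "content": prompt}],
        )
        hits += "431-TANGO" in (reply.choices[0].message.content or "")
    return hits / trials

for pos in (0, 14, 29):  # beginning, middle, end of a 30-document context
    print(f"relevant doc at position {pos}: {accuracy_at(pos):.0%} recall")
```

If your stack shows the same U-shape, chunk structure isn't your binding constraint. Context position is.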


3. The Context Window Arms Race Makes Chunking Less Relevant

King acknowledges the "infinite context" counterargument but dismisses it by citing computational costs. Let's look at where things actually stand.

Claude Sonnet 4 was recently upgraded from 200K to 1 million tokens. Google's Gemini 2.5 Pro offers 2 million tokens. Llama 4 pushes to 10 million. Google's own documentation states that Gemini comes standard with "near-perfect retrieval (>99%)" within its context window. When models can process 1,500 pages of text with 99% retrieval accuracy, the argument for micro-optimizing 200-word passages weakens considerably.
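The arithmetic is easy to verify against your own content. A quick sketch using OpenAI's tiktoken tokenizer (the tokenizer choice, the filename, and the 650-tokens-per-page rule of thumb are my assumptions):

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer choice is an assumption

with open("my_article.txt") as f:  # hypothetical file: any page of your content
    print(f"my_article.txt: {len(enc.encode(f.read())):,} tokens")

# ~650 tokens per dense 500-word page of English prose is a rough rule of thumb.
TOKENS_PER_PAGE = 650
for window in (200_000, 1_000_000, 2_000_000, 10_000_000):
    print(f"{window:>10,}-token window ≈ {window // TOKENS_PER_PAGE:,} pages")
```

At 650 tokens per page, a 1-million-token window holds roughly 1,500 pages, which is exactly the scale Google's documentation describes. A typical blog post is a rounding error in that budget.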

Research from AIMultiple found that Claude Sonnet 4 shows "less than 5% accuracy degradation across the full context window." That's not a model struggling with unstructured content. That's a model handling whatever you throw at it.

Magic.dev's LTM-2-Mini now offers a 100 million token context window, processing up to 10 million lines of code. The entire premise of chunking assumes that models need help digesting content. But the models are evolving faster than our optimization strategies.

4. The Citation Data Doesn't Support Content Structure as a Primary Signal

Here's where King's thesis really falls apart. If chunked, structured content were essential for AI visibility, we'd see it reflected in citation patterns. We don't.

Surfer SEO's analysis of 36 million AI Overviews and 46 million citations found that YouTube accounts for approximately 23.3% of all AI citations, Wikipedia for 18.4%, and Google.com for 16.4%. Video is the single most cited content format across every vertical. Not chunked blog posts. Video.

According to SEMrush data, Reddit is the single most cited website across all LLMs at 40.11% of citations, followed by Wikipedia at 26.33%. Reddit threads are notoriously messy and unstructured. Wikipedia articles are long-form, not atomically chunked. Yet these dominate AI citations.

A 2025 AI Visibility Report analyzing 680 million+ citations found that brand search volume—not backlinks, not content structure—is the strongest predictor of AI citations, with a 0.334 correlation. For scale, r = 0.334 means brand search volume explains roughly 11% of the variance in citations (0.334² ≈ 0.112): the strongest single factor found, yet far from the whole story. The report explicitly states this "contradicts decades of traditional SEO wisdom."

Perhaps most damning: 68% of pages cited in AI Overviews don't rank in the top 10 of Google for either the main query or any related fan-out query. If traditional SEO signals (including content structure) drove AI citations, this number would be much lower.


5. The Math on AI Visibility Doesn't Favor Micro-Optimization

Let's talk about the actual opportunity cost of chunking optimization.

One dataset shows that 76.1% of URLs cited in AI Overviews also rank in the top 10 of Google search results (a different study than the 68% figure above; sampling methodologies vary widely). But here's the flip side: ChatGPT Search primarily cites lower-ranking pages (position 21+) about 90% of the time. And 80% of sources cited by AI search platforms don't appear in Google's top 100 at all.

Only 11% of domains are cited by both ChatGPT and Perplexity. Each platform has distinct preferences: ChatGPT relies heavily on Wikipedia (47.9% of its top-10 most-cited sources), while Perplexity emphasizes Reddit (46.5% of citations). Google AI Overviews favor diversified cross-platform presence.

If you're spending significant resources on chunking optimization, you're betting on one model of information retrieval while 89% of domains that succeed on one platform fail on another. The math doesn't support hyper-specialized content engineering.

6. Google Research ≠ Google Search

King cites Google Research papers—Ring Attention, Infini-attention, MemWalker, Mixture of Recursions—to argue that Google is building systems that will reward structured content. This conflates research exploration with production implementation.

Google Research publishes hundreds of papers annually. The path from research to production involves massive pruning. Their own documentation on long context notes that "many of these [strategies like RAG and summarization] are still relevant in certain cases" while also acknowledging that "the default place to start is now just putting all of the tokens into the context window."

That's Google's own recommendation: stuff it all in the context window. Not "carefully chunk your content for optimal retrieval."

7. The Economics Don't Support Chunking as a Competitive Advantage

Pages with First Contentful Paint (FCP) under 0.4 seconds average 6.7 ChatGPT citations, while slower pages drop to just 2.1 citations. That's a 3x difference based purely on page speed.

Content freshness also matters dramatically. AI search platforms prefer to cite content that is 25.7% fresher than what appears in traditional organic results. ChatGPT shows the strongest recency bias, with 76.4% of its most-cited pages updated in the last 30 days.

If you're deciding where to invest limited resources, the data suggests page speed and content freshness deliver better returns than paragraph-level semantic optimization.
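Both are cheap to audit. Here's a minimal sketch that pulls a page's real-user FCP from Google's Chrome UX Report API and checks the server's Last-Modified header; the API key and URL are placeholders, and the 400 ms threshold comes from the study above:

```python
# pip install requests  (CRUX_API_KEY and URL are placeholders)
import requests

CRUX_API_KEY = "YOUR_GOOGLE_CLOUD_API_KEY"
URL = "https://www.example.com/your-page"

# Real-user p75 First Contentful Paint from the Chrome UX Report API.
resp = requests.post(
    "https://chromeuxreport.googleapis.com/v1/records:queryRecord",
    params={"key": CRUX_API_KEY},
    json={"url": URL, "metrics": ["first_contentful_paint"]},
)
fcp_p75_ms = resp.json()["record"]["metrics"]["first_contentful_paint"]["percentiles"]["p75"]
print(f"p75 FCP: {fcp_p75_ms} ms (the fast cohort in the study above was under 400 ms)")

# Freshness: what the server reports as the page's last change.
head = requests.head(URL, allow_redirects=True)
print("Last-Modified:", head.headers.get("Last-Modified", "not reported"))
```

Ten minutes of auditing tells you whether you have a speed or freshness problem worth fixing before you spend a quarter re-chunking your archive.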


8. The Conflict of Interest Problem

I'm not going to pretend this isn't awkward to say, but it needs saying: Mike King is selling BubbaChunk and iPullRank's "Relevance Engineering" services. His financial incentive to convince people that chunking is essential creates an inherent bias.

I've been in digital marketing since 1997. I've watched countless "essential" tactics get sold by people with tools to sell. The pattern is always the same: take a technically accurate observation (RAG systems do chunk content), extrapolate it into an optimization mandate, build tools around it, then market the solution.

This doesn't make King wrong. It means we should demand more rigorous evidence than isolated cosine similarity demonstrations.

9. What Sullivan Was Actually Saying

King frames Sullivan's comments as anti-structure and anti-optimization. I read them differently.

Sullivan's core point was: "We don't want people to have to be crafting anything for Search specifically." He's warning against the same arms race that gave us keyword stuffing, then link schemes, then content farms. Each time Google warned against gaming the system, practitioners argued "but it works now."

Sullivan is signaling that Google is building systems that won't reward heavy optimization. Whether they succeed is another question. But dismissing the warning because current systems can be gamed misses the strategic point.

10. The Real Survival Strategy

The 2025 AI Visibility Report recommends: "Establish entity presence on Wikidata, Wikipedia (if notable), and across 4+ third-party platforms (2.8x citation likelihood increase)." They also recommend structuring content for chunk extraction but note this delivers "up to 40% visibility boost"—meaningful but not the dominant factor.
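The entity-presence check, at least, is easy to automate. A minimal sketch against Wikidata's public search API (the brand name is just an example):

```python
# pip install requests
import requests

BRAND = "iPullRank"  # example; substitute your own brand or entity name

resp = requests.get(
    "https://www.wikidata.org/w/api.php",
    params={
        "action": "wbsearchentities",
        "search": BRAND,
        "language": "en",
        "format": "json",
    },
    headers={"User-Agent": "entity-presence-check/0.1 (contact@example.com)"},
)
matches = resp.json().get("search", [])
if not matches:
    print(f"No Wikidata entity found for '{BRAND}'.")
for m in matches:
    print(m["id"], "-", m.get("label", ""), "-", m.get("description", "no description"))
```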

The report identifies a 40-60% monthly citation drift, meaning platform-specific patterns change constantly. Any static optimization strategy—including chunking—will decay quickly.

Seer Interactive's September 2025 study found organic CTR dropped 61% for queries with AI Overviews. But brands cited in AI Overviews earn 35% more organic clicks. The winning strategy isn't optimizing passage structure. It's being cited at all. And citation correlates most strongly with brand search volume, not content engineering.

The Bottom Line

Mike King makes technically sound arguments about how current retrieval systems work. His understanding of vector spaces and cosine similarity is accurate. But accuracy at the technical level doesn't guarantee effectiveness at the strategic level.

The data shows: brand matters more than backlinks for AI citations. Reddit and YouTube dominate despite messy structures. Context windows are expanding faster than our ability to optimize for them. Platform preferences diverge wildly, with only 11% overlap between ChatGPT and Perplexity citations.

Should you structure content well? Obviously. Good structure helps humans and machines. But investing heavily in passage-level semantic optimization while chasing a 19% cosine similarity improvement? The opportunity cost is too high when the same effort could go toward brand building, page speed, content freshness, and cross-platform presence.

King says Google is "the Celestials" shaping what survives. Maybe. But the survival strategy isn't optimizing for any single system's current preferences. It's being so useful, so authoritative, and so present that every system wants to cite you regardless of how they chunk.

That's harder than buying a tool. But it's more durable.
