I recently went down a bit of a rabbit hole researching how large language models actually select content to cite.
I wanted to understand the practical mechanics. When an AI system surfaces a passage from a webpage, what made that passage the one it chose? And more importantly, what does that mean for how we should be writing and structuring content?
The paragraph length problem
One finding from that research is fairly obvious once you see it, but easy to miss: paragraph length matters a lot more than most people realise.
Too short, and there isn’t enough substance for a model to extract something meaningful — a two-sentence paragraph rarely contains a complete enough thought to be useful on its own. Too long, and you’re asking the model to consume a significant chunk of context window on a single passage, which makes it less likely to be cited in full and harder to attribute cleanly.
There’s a sweet spot. Long enough to contain a real, complete idea. Short enough to be self-contained and extractable without eating up too much context. We started calling it the Goldilocks length. Not too thin, not too dense, just right.
It’s not an exact science, and it varies depending on the complexity of what you’re explaining. But the principle holds: if you want your content to be cited by AI systems, your paragraphs need to be structured with extractability in mind.
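As a rough sketch of what a check like this might look like, here's a simple word-count classifier. The thresholds are illustrative assumptions for the sake of the example, not the values SiteVitals actually uses:

```javascript
// Illustrative paragraph-length classifier.
// THIN_MAX and DENSE_MIN are assumed thresholds, chosen for this sketch only.
const THIN_MAX = 40;    // fewer words than this: probably too thin to extract
const DENSE_MIN = 120;  // more words than this: probably too dense to cite cleanly

function classifyParagraph(text) {
  // Count whitespace-separated words, ignoring empty strings.
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  if (words < THIN_MAX) return "thin";
  if (words > DENSE_MIN) return "dense";
  return "goldilocks";
}
```

In practice you'd want the thresholds to vary with content type, which is why a fixed rule like this is only a starting point.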
Adding it to SiteVitals
We built this check into SiteVitals as part of our broader AI readiness scoring. It analyses paragraph density across a page and labels each paragraph as too thin, too dense, or sitting in that useful middle ground.
That was the easy part. The harder part came when we started updating the copy on our own website to reflect everything we’d learned. We wanted to check our own paragraphs as we were writing them — not after the fact, not by running a full audit, but in the moment, on the page, as we were building it.
We couldn’t find a tool that did it
We had a look around for a browser extension that would let us audit a page for this kind of thing while we were working on it. There wasn’t one that did what we needed — certainly not one that combined paragraph density analysis with the broader AI readiness signals we cared about.
So we built one. I put it together with Claude Code in a morning, and it turned out to be genuinely useful. It adds a toggleable side panel to any page, highlights paragraphs as too thin, too dense, or Goldilocks, and runs a full AI readiness audit alongside it: structured data, semantic HTML, E-E-A-T signals, Open Graph, heading hierarchy, the works.
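The paragraph pass can be sketched as one function over the page's paragraph text. In a real extension this would presumably be fed from `document.querySelectorAll("p")`; here it takes a plain array of strings so it runs anywhere, and the default thresholds are assumptions, not the extension's actual values:

```javascript
// Hypothetical sketch of a page-level paragraph audit.
// In the browser you might call it as:
//   auditParagraphs([...document.querySelectorAll("p")].map(p => p.textContent))
function auditParagraphs(paragraphs, thinMax = 40, denseMin = 120) {
  const counts = { thin: 0, dense: 0, goldilocks: 0 };
  for (const text of paragraphs) {
    const words = text.trim().split(/\s+/).filter(Boolean).length;
    if (words < thinMax) counts.thin++;
    else if (words > denseMin) counts.dense++;
    else counts.goldilocks++;
  }
  return counts;
}
```

A summary like this is what a side panel would render, with the per-paragraph labels driving the on-page highlighting.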
Everything runs in the browser. No data is sent anywhere. It works on any page you’re visiting, whether it’s yours or not.
So we put it on the Firefox Add-ons store
It’s been useful enough in our own workflow that it seemed worth releasing properly. It’s now available as a free extension for Firefox.
SiteVitals AI Audit — Firefox Add-ons →
If you’re thinking about AI search readiness for your own site — or your clients’ sites — it’s a quick way to see how a page looks to a crawler without needing to log into anything. Install it, browse to a page, open the panel.
We’ve also created a Chrome version, but Google has a few more hoops for us to jump through before it can be listed in the Chrome Web Store, so stay tuned. Hopefully we’ll get that launched soon. And the Goldilocks paragraph check is already live in SiteVitals if you want it running continuously rather than on demand.