If you don't measure your visibility in AI, you can't manage it; all you have is theater with a new name. Most "GEO reports" floating around are loose screenshots: "look, ChatGPT named us here." That isn't measurement. Real measurement is three metrics and a protocol you can repeat every week and compare. Here it is.
What measuring AI visibility actually means
Measuring AI visibility means knowing, with a number, whether ChatGPT, Perplexity and Google AI Overviews mention your brand when someone asks about your category. Not a feeling, not an anecdote: a figure computed over a fixed prompt set and tracked over time.
Presence, share of voice and citation rate
There are three distinct metrics. Keep them separate, because they measure different things and improve with different levers.
- Presence: you show up in the answer, or you don't. The basic yes/no.
- Share of voice in AI: the percentage of answers, over a fixed prompt set, that mention your brand, read side by side with the same figure for each competitor.
- Citation rate: the number of answers where the model attributes you as a source, divided by the number of answers where it mentions you. It tells you whether you're named in passing or treated as a reference.
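As a rough illustration, the three metrics can be computed from a log of answers. This is a minimal sketch, not a standard implementation: the `Answer` record, its field names, and the brands and engine labels in the sample data are all hypothetical, and it assumes you have already decided, by hand or otherwise, which brands each answer mentions and which it attributes as a source.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Answer:
    """One logged answer (hypothetical record layout)."""
    prompt: str
    engine: str
    mentioned: set[str] = field(default_factory=set)  # brands named in the answer
    cited: set[str] = field(default_factory=set)      # brands attributed as a source


def presence(answer: Answer, brand: str) -> bool:
    """Yes/no: does the brand show up in this one answer?"""
    return brand in answer.mentioned


def share_of_voice(answers: list[Answer], brand: str) -> float:
    """Percentage of answers in the set that mention the brand."""
    if not answers:
        return 0.0
    return 100 * sum(brand in a.mentioned for a in answers) / len(answers)


def citation_rate(answers: list[Answer], brand: str) -> float | None:
    """Answers citing the brand as a source, divided by answers mentioning it."""
    mentions = sum(brand in a.mentioned for a in answers)
    if mentions == 0:
        return None  # undefined when the brand is never mentioned
    cites = sum(brand in a.mentioned and brand in a.cited for a in answers)
    return cites / mentions


# Made-up sample data, for illustration only.
answers = [
    Answer("best crm for startups", "engine_a", {"Acme", "Globex"}, {"Acme"}),
    Answer("crm with email sync", "engine_a", {"Acme"}, set()),
    Answer("cheapest crm", "engine_b", {"Globex"}, {"Globex"}),
    Answer("crm for agencies", "engine_b", {"Acme", "Globex"}, set()),
]

print(presence(answers[2], "Acme"))     # False: not named in that answer
print(share_of_voice(answers, "Acme"))  # 75.0: named in 3 of 4 answers
print(citation_rate(answers, "Acme"))   # 1 citation over 3 mentions
```

Running `share_of_voice` once per competitor over the same `answers` list gives the side-by-side comparison; keeping `citation_rate` as a separate function makes it harder to mistake a passing mention for an attribution.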
How to measure it, step by step
The method is boring on purpose, because boring is what you can repeat without the result depending on who runs it that day.
- Fix a constant prompt set: 20-50 real questions from your category, the ones a customer would ask. That set stays untouched.
- Set a baseline: run the set today against each engine and save the result. That's your zero point.
- Measure every week with the same set and compare against the baseline. The trend matters more than any single number.
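The baseline-and-compare steps above can be sketched as follows. Everything here is an assumption made for illustration: the snapshot layout, the function names, the engine label and the numbers. The one deliberate design choice is fingerprinting the prompt set, so that a run against a modified set is rejected instead of silently compared.

```python
import hashlib
import json


def prompt_set_fingerprint(prompts):
    """Stable hash of the prompt set; any edit to the set changes it."""
    joined = "\n".join(prompts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def make_snapshot(prompts, metrics):
    """Bundle one run's metrics with the fingerprint of the set that produced them."""
    return {"fingerprint": prompt_set_fingerprint(prompts), "metrics": metrics}


def compare_to_baseline(baseline, current):
    """Return per-engine deltas (current minus baseline) for shared metrics."""
    if baseline["fingerprint"] != current["fingerprint"]:
        raise ValueError("prompt set changed; runs are not comparable")
    deltas = {}
    for engine, base_metrics in baseline["metrics"].items():
        now = current["metrics"].get(engine, {})
        deltas[engine] = {
            name: now[name] - value
            for name, value in base_metrics.items()
            if name in now
        }
    return deltas


# Made-up prompts and numbers, for illustration only.
prompts = ["best crm for startups", "crm with email sync", "cheapest crm"]

baseline = make_snapshot(
    prompts, {"engine_a": {"share_of_voice": 20.0, "citation_rate": 0.25}}
)
week_1 = make_snapshot(
    prompts, {"engine_a": {"share_of_voice": 25.0, "citation_rate": 0.5}}
)

# Snapshots are plain dicts, so they can be stored as JSON between runs.
stored = json.dumps(baseline)
deltas = compare_to_baseline(json.loads(stored), week_1)
print(deltas)  # {'engine_a': {'share_of_voice': 5.0, 'citation_rate': 0.25}}
```

Because the check raises on a changed prompt set, swapping questions mid-series becomes an explicit decision to start a new baseline.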
What tools and methods exist
You can do it by hand (ask, log, repeat) or with automated monitoring that runs the prompt set against each model and records mentions and citations. The tool isn't the point: the point is a fixed prompt set and the same calculation every week.
| Method | When it fits | Limit |
|---|---|---|
| Manual | Kickoff, few prompts, zero budget | Doesn't scale and depends on who measures |
| Automated | Serious weekly tracking, several engines | Someone still has to maintain the prompt set and interpret the data carefully |
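For the automated route, the core loop is small. The sketch below assumes each engine is wrapped in a function that takes a prompt and returns the answer text; those wrappers are hypothetical and are stubbed out here with fake engines, since the real calls depend on each provider. Mention detection is a plain case-insensitive whole-word match, which is an assumption that will miss misspellings and brand aliases.

```python
import re


def mentions_brand(text, brand):
    """Case-insensitive whole-word match; works for simple alphanumeric brand names."""
    pattern = r"\b" + re.escape(brand) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def run_prompt_set(prompts, engines, brand):
    """Ask every prompt on every engine and record whether the brand is mentioned.

    `engines` maps an engine label to a callable(prompt) -> answer text.
    """
    rows = []
    for engine_name, ask in engines.items():
        for prompt in prompts:
            answer = ask(prompt)
            rows.append(
                {
                    "engine": engine_name,
                    "prompt": prompt,
                    "mentioned": mentions_brand(answer, brand),
                }
            )
    return rows


# Fake engines standing in for real model calls (illustration only).
engines = {
    "fake_engine_a": lambda prompt: "Popular options include Acme and Globex.",
    "fake_engine_b": lambda prompt: "Many teams pick Globex.",
}

rows = run_prompt_set(["best crm for startups"], engines, "Acme")
for row in rows:
    print(row["engine"], row["mentioned"])
# fake_engine_a True
# fake_engine_b False
```

Passing the engines in as callables keeps the calculation identical whichever tool supplies the answers, so a manual log and an automated run can feed the same weekly numbers.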
Common mistakes when measuring
- Measuring once and bragging. A screenshot isn't a metric.
- Changing the questions every week: you lose the comparison.
- Counting mentions without checking whether the model attributes the source.
- Confusing showing up in one answer with owning the category.