Key takeaways
- Traditional rank positions have almost no correlation with AI citation frequency -- a page ranked #1 on Google can be completely absent from AI answers, and vice versa.
- A SparkToro study of ~3,000 prompt tests found ChatGPT returned the same brand list less than 1% of the time, and the exact same rankings less than 0.1% of the time. "Position" in LLMs is largely meaningless.
- 75% of AI citations go to pages outside Google's top ten, meaning your SEO rank is not a reliable proxy for AI visibility.
- The metrics that actually matter: citation rate, share of voice across AI engines, answer gap coverage, crawler activity, and traffic attribution from AI referrals.
- Tools built specifically for AI visibility tracking -- like Promptwatch -- go beyond monitoring to help you find gaps and create content that fixes them.
The uncomfortable truth about your rank tracker
You open your rank tracker on a Monday morning. Position 1 for your main keyword. Position 2 for a cluster of supporting terms. The dashboard is green. Everything looks fine.
Then a colleague mentions that a customer told them they asked ChatGPT for a recommendation in your category, and your brand wasn't mentioned. Not buried -- completely absent. The AI recommended three competitors instead.
This is happening to brands everywhere right now, and the rank tracker has no way to tell you about it. That's not a bug in the tool -- it's a fundamental mismatch between what rank trackers were built to measure and how people are actually finding information in 2026.
The problem isn't that rank tracking is useless. It's that it's incomplete in a way that most marketing teams haven't fully reckoned with yet.
Why "position" doesn't exist in LLMs
Here's the thing that makes AI search genuinely different from Google: there is no stable list.
Tony Pataky, Director of SEO at Procore, discussed this directly in a recent podcast episode on why AI rank tracking is broken. He referenced a SparkToro study by Rand Fishkin that ran roughly 3,000 prompt tests across ChatGPT, Claude, Gemini, and Google AI with about 600 volunteers. The finding: ChatGPT returned the same list of brands less than 1 in 100 times, and the exact same "rankings" less than 1 in 1,000 times.

Think about what that means. When you check your Google rank for "best project management software," you get a stable answer. Position 4 today will probably be position 4 tomorrow. You can track movement, spot drops, celebrate gains.
When you ask ChatGPT the same question, the model synthesizes an answer from its training data, retrieval sources, and probabilistic generation. Ask it again in five minutes and you might get a different set of brands in a different order with different framing. The concept of a "rank" -- a fixed position in a fixed list -- simply doesn't apply.
This means any tool that claims to give you an "AI rank" for a keyword is, at best, giving you a snapshot of one response at one moment in time. At worst, it's giving you false confidence that you're tracking something stable when you're not.
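One way to see this instability in your own data is to run the same prompt many times and measure how often two responses agree. The sketch below is an illustration only, not the SparkToro methodology: `pairwise_agreement` is a hypothetical helper, and it assumes you have already extracted the ordered list of brands from each response.

```python
from itertools import combinations

def pairwise_agreement(runs):
    """Return (same_set, same_order) agreement rates across all pairs of runs.

    `runs` is a list of brand lists, one per response to the same prompt.
    same_set: share of pairs naming the same brands, in any order.
    same_order: share of pairs naming the same brands in the same order.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    same_set = sum(set(a) == set(b) for a, b in pairs)
    same_order = sum(list(a) == list(b) for a, b in pairs)
    return same_set / len(pairs), same_order / len(pairs)

# Made-up responses to one prompt, for illustration only.
example_runs = [
    ["Acme", "Globex", "Initech"],
    ["Globex", "Acme", "Initech"],
    ["Acme", "Globex", "Initech"],
    ["Acme", "Umbrella"],
]
set_rate, order_rate = pairwise_agreement(example_runs)
print(f"same brands: {set_rate:.1%}, same order: {order_rate:.1%}")
```

If both rates are low for your category, a single "AI rank" reading tells you very little, and sampling many runs becomes the only sensible way to measure.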
The citation gap that rank data can't see
There's a related problem that's even more disorienting for teams that have built their strategy around Google rankings.
Research consistently shows that roughly 75% of AI citations go to pages that don't appear in Google's top ten results. That's not a small rounding error -- it means the majority of content AI engines are pulling from and recommending to users is content that traditional SEO metrics would classify as underperforming or invisible.
Why does this happen? AI models don't rank pages the way Google does. They're looking for content that directly answers a specific question, contains authoritative and specific information, and is structured in a way that's easy to extract. A page that ranks #1 for a broad head term might be optimized for click-through rate and keyword density in ways that make it less useful as a citation source. Meanwhile, a detailed FAQ page sitting at position 23 might be exactly what an AI model pulls when someone asks a specific question.
This creates a situation where your rank tracker is measuring the wrong game entirely.
What traditional rank tracking gets wrong about AI search
Let's be specific about the gaps, because they're not all obvious.
Rank trackers assume a single, consistent SERP. Google's results are personalized and vary by location, but they're stable enough to track. AI answers are generated fresh each time, vary by phrasing, and differ across models. A "rank" in this context is a fiction.
Rank trackers measure your position, not your presence. Whether you appear at all in an AI answer is a binary question -- cited or not cited. The concept of position 1 vs. position 5 is much less meaningful when the AI might mention you once in a paragraph alongside three competitors, or not at all.
Rank trackers don't tell you why you're invisible. If you drop from position 3 to position 7 in Google, you can investigate: did a competitor gain links? Did your page lose authority? Did Google's algorithm update affect your category? If you're not being cited by ChatGPT, a rank tracker gives you nothing to work with.
Rank trackers don't cover the right channels. Most rank trackers are built for Google, with some Bing coverage. They have no visibility into ChatGPT, Perplexity, Claude, Gemini, Grok, DeepSeek, or any of the other AI engines that are now part of how people research purchases and find recommendations.
The metrics that actually matter in 2026
So if position data is broken for AI search, what should you track instead? Here's a practical framework.
Citation rate and share of voice
The first question to answer is simple: when someone asks an AI engine a question relevant to your category, do you appear in the answer? And how often, relative to competitors?
This is called citation rate -- the percentage of relevant prompts where your brand or content is mentioned. Share of voice extends this to show you how your citation rate compares to competitors across the same prompt set.
These metrics are meaningful in a way that position isn't, because they ask a binary question (cited or not) across a large sample of prompts, rather than trying to assign a stable rank to an inherently unstable output.
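As a rough sketch of the arithmetic, assume each AI response has already been reduced to the set of brands it mentions. The function names are hypothetical, and the share-of-voice formula used here (a brand's mentions divided by all tracked-brand mentions) is one common definition, not the only one; tools may calculate it differently.

```python
from collections import Counter

def citation_rates(responses, brands):
    """Share of responses that mention each brand at least once."""
    if not responses:
        raise ValueError("need at least one response")
    total = len(responses)
    return {b: sum(b in r for r in responses) / total for b in brands}

def share_of_voice(responses, brands):
    """Each brand's mentions as a fraction of all tracked-brand mentions."""
    tracked = set(brands)
    mentions = Counter(b for r in responses for b in r if b in tracked)
    total = sum(mentions.values())
    return {b: (mentions[b] / total if total else 0.0) for b in brands}

# Made-up data: one set of mentioned brands per (prompt, engine) response.
responses = [
    {"Acme", "Globex"},
    {"Globex"},
    {"Initech", "Globex"},
    {"Acme"},
]
brands = ["Acme", "Globex", "Initech"]
print(citation_rates(responses, brands))
print(share_of_voice(responses, brands))
```

Because each response contributes only a yes or no per brand, the numbers get more trustworthy as the prompt set and the number of repeated runs grow.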
Answer gap coverage
This is where the real opportunity lives. Answer gap analysis maps the questions and prompts that AI engines are answering in your category against the content you actually have on your site. The gaps -- prompts where competitors are cited but you're not -- are your content roadmap.
If ChatGPT consistently recommends a competitor when someone asks "what's the best tool for [your use case]," the question isn't "how do I rank higher?" It's "what content am I missing that would make me a credible citation source for that question?"
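In its simplest form, a gap list is a filter over the same prompt data: keep the prompts where at least one competitor is cited and you are not. This is a minimal sketch with a hypothetical `answer_gaps` helper and made-up brands, not a description of how any particular tool implements gap analysis.

```python
def answer_gaps(citations, our_brand):
    """Return {prompt: competitors cited} for prompts where we are absent.

    `citations` maps each prompt to the set of brands cited for it,
    aggregated over however many runs and engines you sampled.
    """
    return {
        prompt: set(brands)
        for prompt, brands in citations.items()
        if brands and our_brand not in brands
    }

# Made-up example data.
citations = {
    "best tool for invoicing": {"Globex", "Initech"},
    "how to automate expense reports": {"Acme", "Globex"},
    "invoicing software for freelancers": set(),
}
for prompt, competitors in answer_gaps(citations, "Acme").items():
    print(prompt, "->", sorted(competitors))
```

Prompts where nobody is cited are left out on purpose: they may be opportunities too, but they are not gaps against a competitor.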
AI crawler activity
AI engines don't just retrieve content from the web in real time -- they crawl and index it, much like Google's spider. Knowing when GPTBot, ClaudeBot, PerplexityBot, and others are hitting your pages, which pages they're reading, and whether they're encountering errors is a genuinely useful signal.
If a crawler visits a page repeatedly but you're never cited from it, something is wrong with how that content is structured. If a crawler hasn't visited a key page in months, it may not be in the model's awareness at all.
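If you have access to raw server logs, a first pass at this needs very little code. The sketch below assumes the widely used "combined" access log layout and matches crawlers by looking for their names in the user-agent string; the sample log lines and user-agent strings are invented for illustration, and `summarize_ai_crawls` is a hypothetical helper. User-agent strings can be spoofed, so treat the counts as indicative.

```python
import re
from collections import defaultdict

# Crawler names taken from the article; extend the tuple for other bots.
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Assumed layout (combined log format):
# ip - - [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)

def summarize_ai_crawls(lines):
    """Count hits and error responses (status >= 400) per (bot, path)."""
    summary = defaultdict(lambda: {"hits": 0, "errors": 0})
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # line is not in the assumed format
        ua = match["ua"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in ua), None)
        if bot is None:
            continue  # not one of the tracked AI crawlers
        entry = summary[(bot, match["path"])]
        entry["hits"] += 1
        if int(match["status"]) >= 400:
            entry["errors"] += 1
    return dict(summary)

# Invented sample lines, for illustration only.
sample = [
    '203.0.113.7 - - [10/Jan/2026:08:15:02 +0000] "GET /pricing HTTP/1.1" '
    '200 5123 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.7 - - [10/Jan/2026:09:15:02 +0000] "GET /pricing HTTP/1.1" '
    '404 0 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
]
for (bot, path), stats in summarize_ai_crawls(sample).items():
    print(bot, path, stats)
```

Pages with many hits and no citations, or key pages with no hits at all, are the two patterns described above that deserve a closer look.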
Referral traffic from AI engines
This one is imperfect but worth tracking. When Perplexity or ChatGPT cites your content and includes a link, some users click through. That traffic shows up in your analytics as a referral from the AI engine's domain.
The catch, as Pataky notes, is that LLM referral traffic often registers at under 1% in analytics even when AI is influencing 30-50% of buyer journeys. Users don't always click the citation links -- they take the recommendation and then search for the brand directly. So referral traffic is a floor, not a ceiling, on AI's actual influence.
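To measure that floor, you can tag sessions by referrer hostname. The sketch below is a minimal example under stated assumptions: the domain-to-engine mapping is a guess you should verify against the referrers that actually appear in your own analytics, and both function names are hypothetical.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Assumed referrer domains; check these against your own analytics data.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "claude.ai": "Claude",
}

def classify_referrer(referrer: str) -> Optional[str]:
    """Return the AI engine name for a referrer URL, or None if not matched."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, engine in AI_REFERRERS.items():
        # Match the domain itself or any subdomain such as www.
        if host == domain or host.endswith("." + domain):
            return engine
    return None

def ai_referral_summary(referrers):
    """Return (sessions per engine, AI share of all sessions)."""
    counts = Counter()
    for ref in referrers:
        engine = classify_referrer(ref)
        if engine is not None:
            counts[engine] += 1
    total = len(referrers)
    share = sum(counts.values()) / total if total else 0.0
    return dict(counts), share

# Made-up session referrers; an empty string stands for direct traffic.
sessions = ["https://chatgpt.com/", "https://www.google.com/", ""]
print(ai_referral_summary(sessions))
```

Visitors who hear about you from an AI answer and then search for your brand directly will land in the unmatched bucket, which is exactly why this number understates AI's influence.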
Post-conversion survey data
This is the underrated one. A simple "how did you hear about us?" survey at checkout or sign-up, with AI search as an explicit option, often reveals that AI was part of the journey far more often than referral data suggests. Some teams find AI was involved in 30-50% of buyer journeys even when their analytics show almost no LLM traffic. That gap is the dark funnel that rank trackers will never capture.
A comparison of what to track (and what tools help)
| Metric | What it tells you | Rank tracker covers it? | AI visibility tools cover it? |
|---|---|---|---|
| Google position | Where you rank in traditional search | Yes | Partial |
| Citation rate | How often AI engines mention you | No | Yes |
| Share of voice | Your visibility vs. competitors in AI | No | Yes |
| Answer gap coverage | Which prompts you're missing | No | Yes (advanced tools) |
| AI crawler activity | Which bots are reading your pages | No | Yes (advanced tools) |
| LLM referral traffic | Clicks from AI citations | No | Partial |
| Post-conversion attribution | AI's role in buyer journeys | No | No (needs surveys) |
The tools built for this problem
A new category of platforms has emerged specifically to track AI visibility. They vary significantly in depth.
Some tools give you basic monitoring -- you set up a list of prompts, they run them periodically, and they show you whether your brand appeared. That's useful as a starting point, but it's still just a snapshot.
Promptwatch takes a different approach. Rather than just showing you where you appear, it's built around an action loop: find the gaps, create content to fill them, track the results. The Answer Gap Analysis shows which prompts competitors are winning that you're not. Content Agents generate articles grounded in real prompt data to fill those gaps. And AI Crawler Logs show you exactly which pages GPTBot, ClaudeBot, and others are reading, when they return, and when a crawled page moves to an actual citation.

For teams that want to start with monitoring and build from there, there are several options worth knowing about.
Otterly.AI covers brand mention tracking across ChatGPT, Perplexity, and Google AI Overviews. It's a solid monitoring tool for teams getting started with AI visibility.

Profound is an enterprise-focused platform with strong tracking across multiple AI engines and good competitor benchmarking.

Peec AI offers AI search visibility tracking with a clean interface, good for marketing teams that want prompt-level data without a steep learning curve.
AthenaHQ focuses on brand tracking across AI engines with solid monitoring capabilities, though like most tools in this space, it's primarily a visibility tracker rather than an optimization platform.
For traditional SEO rank tracking that's starting to incorporate AI features, Semrush and Ahrefs both have some AI search coverage, though their core strength remains Google-focused.
How to build a measurement baseline right now
You don't need to overhaul everything at once. Here's a practical starting point.
First, identify 20-30 prompts that represent how your target customers actually ask for help in your category. Don't just translate your top SEO keywords into questions -- think about the conversational, exploratory prompts someone would type into ChatGPT when they're trying to solve a problem. "What's the best tool for X," "help me choose between X and Y," and "how do I solve Z problem" are three different prompts, and each can surface different brands.
Second, run those prompts manually across ChatGPT, Perplexity, and Gemini. Document whether you appear, where competitors appear, and what content is being cited. This is your baseline.
Third, set up automated tracking for those prompts using one of the tools above. You want to see trends over time, not just snapshots.
Fourth, add a "how did you hear about us?" field to your conversion flow with AI search as an explicit option. This is free to implement and will give you qualitative signal that no tool can provide.
Fifth, check your server logs or set up crawler monitoring to see whether GPTBot and other AI crawlers are visiting your key pages. If they're not, that's a technical problem to solve before worrying about content.
The deeper shift to internalize
The reason rank tracking feels broken for AI search isn't just a tooling problem. It reflects a genuine change in how information flows from content to user.
In the Google era, the path was: you create content, Google ranks it, users click through to your site. The rank was the mechanism of discovery, and tracking it made sense.
In the AI era, the path is: you create content, AI engines read it and incorporate it into synthesized answers, users get recommendations without necessarily clicking anywhere. The citation is the mechanism of discovery, and the rank tracker was never designed to see it.
This doesn't mean Google rankings are irrelevant -- they're still a useful proxy for authority and a signal AI models use when deciding what to cite. But they're one input among many, not the primary metric.
The teams that figure this out first will have a real advantage. Not because they'll game some new algorithm, but because they'll understand what their customers are actually seeing when they ask an AI for help -- and they'll be in those answers instead of watching competitors take the recommendation.
Start measuring what's actually happening. The rank tracker can stay, but it can't be the whole story anymore.


