AI Visibility API Pagination and Data Volume: How to Pull 6 Months of Citation History Without Hitting Limits in 2026

Pulling 6 months of AI citation history via API sounds simple until you hit rate limits, pagination gaps, and inconsistent data schemas. This guide shows you exactly how to do it right in 2026.

Key takeaways

  • Most AI visibility APIs return paginated results with limits between 100 and 1,000 records per request -- pulling 6 months of citation history requires a structured loop, not a single call
  • Rate limits vary significantly by platform and plan tier; exceeding them silently truncates data rather than throwing errors in many implementations
  • Citation data across AI models (ChatGPT, Perplexity, Claude, etc.) has different schemas and temporal resolutions -- normalize before aggregating
  • Cursor-based pagination is more reliable than offset-based for large historical pulls because it handles concurrent writes without skipping records
  • Tools like Promptwatch expose historical citation data via API with built-in pagination support, making bulk exports significantly less painful

Pulling six months of citation history from an AI visibility API sounds like a five-minute task. You write a loop, call the endpoint, dump the results. Done.

Then you hit record 1,001 and the response comes back empty. Or you get a 429 after 40 requests. Or you realize the data from February is structured differently from the data in August because the platform updated its schema mid-year. Now you have a half-built export, a frustrated data team, and a Monday standup you're not looking forward to.

This guide is for the people who've been there, or who want to avoid being there. We'll cover how AI visibility APIs handle pagination, what rate limits actually look like in practice, how to structure a bulk historical pull, and what to watch out for when you're working with citation data across multiple AI models.


Why historical citation data is harder to pull than it looks

AI visibility platforms track citations differently from traditional rank trackers. A rank tracker records a position once a day per keyword. An AI visibility platform might run the same prompt across 10 different models, capture the full response, extract cited URLs, score sentiment, and log all of it -- multiple times per day.

That's a lot of rows. Six months of data across 150 prompts and 10 models, running twice daily, is roughly 540,000 response records before you even start counting individual citation URLs within each response.

Most APIs aren't designed to hand you 540,000 records in one shot. They're designed to serve dashboards, which need the last 30 days of data for a handful of prompts. When you try to use them for bulk historical exports, you run into three problems:

  1. Pagination limits that weren't designed for bulk use
  2. Rate limits that throttle you before you finish
  3. Schema inconsistencies in older data that break your parser

Each of these is solvable, but you need to know they're coming.


Understanding pagination in AI visibility APIs

Offset vs. cursor-based pagination

Most APIs use one of two pagination approaches, and the choice matters a lot for bulk historical pulls.

Offset-based pagination looks like this:

GET /citations?start_date=2024-10-01&end_date=2025-04-01&limit=100&offset=0
GET /citations?start_date=2024-10-01&end_date=2025-04-01&limit=100&offset=100
GET /citations?start_date=2024-10-01&end_date=2025-04-01&limit=100&offset=200

It's simple to implement, but it has a real problem: if new records are written to the database while you're paginating, your offsets shift. You end up with duplicate records or gaps. For a 6-month historical pull, this is less of an issue than for live data, but it still causes problems if the platform is backfilling or correcting historical data (which happens more often than you'd think).
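To make the failure mode concrete, here's a toy simulation (plain Python, no real API) of offset pagination over a newest-first dataset while one new record arrives mid-pull -- the second page re-serves a record you already have:

```python
# Toy stand-in for the API's table, newest records first.
db = [f"rec_{i}" for i in range(10)]  # rec_0 (newest) .. rec_9 (oldest)

def page(offset, limit=3):
    return db[offset:offset + limit]

first = page(0)           # ['rec_0', 'rec_1', 'rec_2']

# A new record is written before the next request...
db.insert(0, "rec_new")

second = page(3)          # ['rec_2', 'rec_3', 'rec_4'] -- rec_2 is duplicated
```

The mirror-image problem (deletions shifting records backward) produces gaps instead of duplicates.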

Cursor-based pagination is more robust:

GET /citations?start_date=2024-10-01&limit=100
# Response includes: "next_cursor": "eyJpZCI6MTAwMX0="

GET /citations?cursor=eyJpZCI6MTAwMX0=&limit=100
# Response includes: "next_cursor": "eyJpZCI6MjAwMX0="

The cursor encodes your position in the dataset. Even if new records appear, your cursor stays anchored to where you were. This is the approach you want for large pulls. If the API you're working with supports both, use cursors.
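Cursors are opaque tokens and you shouldn't parse them in production code, but decoding the one from the example above shows there's no magic -- it's just a base64-encoded JSON position marker:

```python
import base64
import json

cursor = "eyJpZCI6MTAwMX0="
decoded = json.loads(base64.b64decode(cursor))
print(decoded)  # {'id': 1001} -- the id of the last record the server returned
```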

Page size limits

Most AI visibility APIs cap page size somewhere between 100 and 1,000 records. The practical ceiling is usually lower than the documented maximum because large responses time out on the server side before they return.

A safe default is 250 records per page. Test with 500 and see if you get consistent response times. If responses start taking more than 3-4 seconds, drop back to 250.

import requests
import time

def pull_citation_history(api_key, start_date, end_date, page_size=250):
    base_url = "https://api.yourplatform.com/v1/citations"
    headers = {"Authorization": f"Bearer {api_key}"}
    
    all_records = []
    cursor = None
    
    while True:
        params = {
            "start_date": start_date,
            "end_date": end_date,
            "limit": page_size
        }
        if cursor:
            params["cursor"] = cursor
        
        response = requests.get(base_url, headers=headers, params=params)
        
        if response.status_code == 429:
            # Rate limited -- wait and retry
            retry_after = int(response.headers.get("Retry-After", 60))
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
            continue
        
        response.raise_for_status()
        data = response.json()
        all_records.extend(data.get("results", []))
        
        cursor = data.get("next_cursor")
        if not cursor:
            break
        
        # Polite delay between requests
        time.sleep(0.5)
    
    return all_records

Handling rate limits without losing data

What rate limits actually look like

Rate limits on AI visibility APIs are usually expressed as requests per minute (RPM) or requests per day. The numbers vary by plan tier:

Plan tier             | Typical RPM | Daily limit            | Notes
----------------------|-------------|------------------------|---------------------------
Free / trial          | 10-20       | 500-1,000              | Often undocumented
Essential / Starter   | 30-60       | 5,000-10,000           | Usually documented
Professional          | 60-120      | 25,000-50,000          | Sometimes burst allowances
Business / Enterprise | 120-300+    | Unlimited or very high | SLA-backed

The tricky part: many platforms don't return a clean 429 error when you hit limits. Some return 200 with an empty results array. Some return partial results. Some silently queue your request and return stale data. Always validate that your response contains the expected number of records, not just that the status code is 200.
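One way to encode that validation (a sketch; the `results` field name follows the hypothetical schema used elsewhere in this guide): treat a suspiciously short 200 as a soft failure rather than the end of the data.

```python
def validate_page(response, expected_limit, is_last_page):
    """Return True if a page looks trustworthy, False if it smells like throttling."""
    if response.status_code != 200:
        return False
    results = response.json().get("results", [])
    # A short page is only legitimate on the final page of the dataset.
    if len(results) < expected_limit and not is_last_page:
        print(f"Suspicious short page: {len(results)}/{expected_limit} records")
        return False
    return True
```

On a False, re-request the same page (with backoff) instead of advancing your cursor.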

The exponential backoff pattern

When you do hit a 429, don't just wait the Retry-After value and hammer the API again. Use exponential backoff with jitter:

import random
import requests
import time

def request_with_backoff(url, headers, params, max_retries=5):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers, params=params)
        
        if response.status_code == 200:
            return response
        
        if response.status_code == 429:
            base_wait = 2 ** attempt  # 1, 2, 4, 8, 16 seconds
            jitter = random.uniform(0, base_wait * 0.1)
            wait_time = base_wait + jitter
            print(f"Attempt {attempt + 1} failed. Waiting {wait_time:.1f}s")
            time.sleep(wait_time)
        else:
            # Non-rate-limit error -- raise immediately
            response.raise_for_status()
    
    raise Exception(f"Max retries exceeded for {url}")

Chunking by time window

For a 6-month pull, don't try to pull all 180 days in one paginated loop. Break it into weekly or monthly chunks. This has two benefits: smaller result sets per chunk are less likely to hit timeouts, and if something fails mid-pull, you only need to re-pull the failed chunk rather than starting over.

from datetime import datetime, timedelta

def generate_date_chunks(start_date, end_date, chunk_days=7):
    """Generate weekly date ranges between start and end dates."""
    chunks = []
    current = start_date
    
    while current < end_date:
        chunk_end = min(current + timedelta(days=chunk_days), end_date)
        chunks.append((current, chunk_end))
        current = chunk_end
    
    return chunks

# Usage
start = datetime(2024, 10, 1)
end = datetime(2025, 4, 1)
chunks = generate_date_chunks(start, end, chunk_days=14)

all_data = []
for chunk_start, chunk_end in chunks:
    print(f"Pulling {chunk_start.date()} to {chunk_end.date()}...")
    chunk_data = pull_citation_history(
        api_key=API_KEY,
        start_date=chunk_start.isoformat(),
        end_date=chunk_end.isoformat()
    )
    all_data.extend(chunk_data)
    time.sleep(2)  # Pause between chunks

Normalizing citation data across AI models

This is the part most guides skip, and it's where bulk historical pulls get messy.

Different AI models structure citations differently. Perplexity returns numbered source URLs inline. ChatGPT (with web browsing) returns citations with titles and snippets. Google AI Overviews return structured source cards. Claude's citation format changed twice in 2024 alone.

When you pull 6 months of data from a platform that monitors multiple models, you're going to get heterogeneous schemas. A record from October 2024 might have source_url as a field. A record from February 2025 might have cited_url instead. Both mean the same thing.

Build a normalization layer before you do any analysis:

def normalize_citation_record(raw_record):
    """Normalize citation records from different AI models and time periods."""
    
    normalized = {
        "id": raw_record.get("id") or raw_record.get("record_id"),
        "timestamp": raw_record.get("timestamp") or raw_record.get("created_at"),
        "model": raw_record.get("model") or raw_record.get("ai_engine"),
        "prompt": raw_record.get("prompt") or raw_record.get("query"),
        "cited_url": (
            raw_record.get("cited_url") or 
            raw_record.get("source_url") or 
            raw_record.get("citation_url")
        ),
        "brand_mentioned": raw_record.get("brand_mentioned", False),
        "sentiment": raw_record.get("sentiment") or raw_record.get("tone"),
        "position": raw_record.get("position") or raw_record.get("rank"),
    }
    
    # Handle nested citation objects (common in newer API versions)
    if "citation" in raw_record and isinstance(raw_record["citation"], dict):
        normalized["cited_url"] = raw_record["citation"].get("url")
    
    return normalized

The temporal instability problem

One thing worth knowing before you start analyzing 6 months of citation data: the Passionfruit research team reviewed 2.2M+ ChatGPT responses and found that citation patterns are temporally unstable. The same prompt run on the same model can return completely different citations on different days. SparkToro found the odds of getting the same brand list twice from ChatGPT at less than 1 in 100.

This means your historical data isn't a clean time series of "position over time." It's a distribution of outcomes. When you analyze it, think in terms of citation frequency (how often your brand appears across all runs of a prompt) rather than position (where you ranked on a specific day).



Structuring your data pipeline for a 6-month pull

Here's a practical architecture for a one-time historical export:

Step 1: Inventory your prompts and models

Before you write a single API call, know what you're pulling. List every prompt you're tracking, every model, and the date range. Calculate the approximate record count:

Prompts × Models × Days × Runs per day = Total records
150 × 10 × 180 × 2 = 540,000 records

This tells you roughly how long the pull will take and whether you need to worry about storage.

Step 2: Set up checkpointing

For a pull this size, you will get interrupted. A checkpoint file saves your progress so you can resume without starting over:

import json
import os

CHECKPOINT_FILE = "citation_pull_checkpoint.json"

def load_checkpoint():
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE, "r") as f:
            return json.load(f)
    return {"completed_chunks": [], "total_records": 0}

def save_checkpoint(checkpoint):
    with open(CHECKPOINT_FILE, "w") as f:
        json.dump(checkpoint, f)

# In your main loop
checkpoint = load_checkpoint()

for chunk_start, chunk_end in chunks:
    chunk_key = f"{chunk_start.date()}_{chunk_end.date()}"
    
    if chunk_key in checkpoint["completed_chunks"]:
        print(f"Skipping {chunk_key} (already pulled)")
        continue
    
    chunk_data = pull_citation_history(...)
    save_to_storage(chunk_data)  # your own local-buffer write helper (see Step 3)
    
    checkpoint["completed_chunks"].append(chunk_key)
    checkpoint["total_records"] += len(chunk_data)
    save_checkpoint(checkpoint)

Step 3: Write to a local buffer, not directly to your database

Don't write records directly to your production database as they come in. Write to local JSON files or a SQLite database first, then do a single bulk import when the pull is complete. This is faster and lets you validate the data before it touches production.
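A minimal version of that buffer using the standard library's sqlite3 (the column names mirror the normalized schema sketched earlier -- adapt to yours):

```python
import json
import sqlite3

def init_buffer(path="citation_buffer.db"):
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS citations (
            id TEXT PRIMARY KEY,
            timestamp TEXT,
            model TEXT,
            cited_url TEXT,
            raw_json TEXT
        )
    """)
    return conn

def buffer_records(conn, records):
    # INSERT OR IGNORE makes re-running a chunk after a crash safe (dedup by id).
    conn.executemany(
        "INSERT OR IGNORE INTO citations VALUES (?, ?, ?, ?, ?)",
        [
            (r["id"], r["timestamp"], r["model"], r["cited_url"], json.dumps(r))
            for r in records
        ],
    )
    conn.commit()
```

Keeping the raw JSON alongside the normalized columns means you can re-normalize later without re-pulling.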

Step 4: Validate before you analyze

Once the pull is complete, run basic validation:

  • Total record count matches your estimate (within 10-15% is fine; citation data has natural variance)
  • No date gaps longer than 3 days (unless you know the platform was down)
  • All expected models are represented
  • Citation URL field is populated in at least 80% of records
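Those checks are easy to script against the normalized records (a sketch assuming the field names from the normalization layer above):

```python
from datetime import datetime, timedelta

def validate_pull(records, expected_count, expected_models):
    issues = []
    if len(records) < expected_count * 0.85:
        issues.append(f"Low record count: {len(records)} vs ~{expected_count}")

    missing = set(expected_models) - {r["model"] for r in records}
    if missing:
        issues.append(f"Missing models: {missing}")

    url_fill = sum(1 for r in records if r.get("cited_url")) / max(len(records), 1)
    if url_fill < 0.8:
        issues.append(f"cited_url only populated in {url_fill:.0%} of records")

    # Flag date gaps longer than 3 days
    days = sorted({r["timestamp"][:10] for r in records})
    for prev, nxt in zip(days, days[1:]):
        if datetime.fromisoformat(nxt) - datetime.fromisoformat(prev) > timedelta(days=3):
            issues.append(f"Date gap: {prev} -> {nxt}")
    return issues
```

An empty list means the pull is safe to promote into your analysis database.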

Platform-specific considerations

Different AI visibility platforms handle bulk exports differently. Here's what to expect from the major ones:

Platform    | Pagination type | Max page size | Historical depth       | API availability
------------|-----------------|---------------|------------------------|-----------------
Promptwatch | Cursor-based    | 500           | Full history           | All paid plans
Profound    | Offset-based    | 100           | 90 days on lower tiers | Business+
Otterly.AI  | Offset-based    | 50            | 30 days                | Limited
Peec.ai     | Cursor-based    | 200           | 60 days                | Pro+
AthenaHQ    | Offset-based    | 100           | 90 days                | Enterprise

Promptwatch is worth calling out specifically here because it's one of the few platforms that gives you full historical depth on paid plans (not just 30 or 90 days), and its API uses cursor-based pagination which makes bulk pulls significantly more reliable. If you're building a data pipeline that needs to pull citation history regularly, the pagination design matters as much as the data itself.


For platforms using offset-based pagination with small page sizes (like Otterly.AI at 50 records per page), a 6-month pull with 150 prompts means roughly 10,000+ API calls. At typical rate limits, that's a multi-hour job. Plan accordingly.
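The back-of-envelope math, using the 540,000-record estimate from earlier (your numbers will differ):

```python
import math

records = 150 * 10 * 180 * 2     # 540,000 response records
calls = math.ceil(records / 50)  # 10,800 requests at 50 records per page
hours = calls / 60 / 60          # 3.0 hours at a sustained 60 requests/minute
print(calls, round(hours, 1))    # 10800 3.0
```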


Common failure modes and how to fix them

Empty pages that aren't actually the end

Some APIs return an empty results array when you hit a rate limit or a server error, rather than returning an error code. Your pagination loop sees an empty page and thinks it's done. You end up with a fraction of your data.

Fix: Always check the total count returned in the first response and compare it to how many records you actually pulled.

# First request -- capture total
first_response = requests.get(url, headers=headers, params={**params, "limit": 1})
total_expected = first_response.json().get("total_count", 0)

# After full pull
if len(all_records) < total_expected * 0.9:
    print(f"WARNING: Expected ~{total_expected} records, got {len(all_records)}")

Schema changes breaking your parser

AI visibility is a young category. Platforms update their APIs frequently. A field that was brand_mentioned: true/false in Q4 2024 might be visibility_score: 0-100 in Q1 2025.

Fix: Use a flexible parser that logs unknown fields rather than crashing on them. Review your normalization layer every time you do a fresh pull.
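A lightweight way to do that (a sketch; KNOWN_FIELDS would mirror your own normalization layer):

```python
import logging

logger = logging.getLogger("citation_parser")

KNOWN_FIELDS = {
    "id", "record_id", "timestamp", "created_at", "model", "ai_engine",
    "prompt", "query", "cited_url", "source_url", "citation_url",
    "brand_mentioned", "sentiment", "tone", "position", "rank", "citation",
}

def parse_record(raw_record):
    unknown = set(raw_record) - KNOWN_FIELDS
    if unknown:
        # Log and carry on instead of crashing -- review these before the next pull.
        logger.warning("Unknown fields encountered: %s", sorted(unknown))
    return {k: v for k, v in raw_record.items() if k in KNOWN_FIELDS}
```

Grepping the logs for "Unknown fields" after each pull tells you exactly when a platform shipped a schema change.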

Duplicate records from overlapping date ranges

If your chunks overlap by even one second, you'll get duplicate records at the boundaries. Use half-open intervals: [start, end) where the end date is exclusive.

# Correct: end date is exclusive
params = {
    "start_date": "2024-10-01T00:00:00Z",
    "end_date": "2024-10-08T00:00:00Z"  # Not included
}
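Even with exclusive end dates, a cheap safety net is to de-duplicate by record id after the pull:

```python
def dedupe_records(records):
    """Keep the first occurrence of each record id, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record["id"] not in seen:
            seen.add(record["id"])
            unique.append(record)
    return unique
```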

What to do with 6 months of citation data once you have it

The point of pulling historical data isn't the pull itself. A few analyses that are actually worth running:

Citation frequency trend: For each prompt, what percentage of responses cited your brand each month? A declining trend is an early warning sign. A sudden spike usually correlates with a piece of content you published or earned media coverage.

Model-by-model breakdown: Your citation rate on Perplexity and your citation rate on ChatGPT are almost certainly different. Treating them as one number hides the story. Break them out.

Competitor gap analysis: Which prompts are your competitors cited for that you're not? Six months of data gives you enough signal to identify structural gaps rather than noise.

Content correlation: If you have a list of content publication dates, overlay them against citation frequency. This is the closest thing to a causal test you can run without a controlled experiment.
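The citation frequency trend above is only a few lines once records are normalized (a sketch assuming the normalized field names from earlier and a boolean brand_mentioned flag):

```python
from collections import defaultdict

def monthly_citation_frequency(records, prompt):
    """Share of responses to `prompt` that mentioned the brand, per month."""
    runs = defaultdict(int)
    hits = defaultdict(int)
    for r in records:
        if r["prompt"] != prompt:
            continue
        month = r["timestamp"][:7]  # "2024-10" from an ISO timestamp
        runs[month] += 1
        if r["brand_mentioned"]:
            hits[month] += 1
    return {m: hits[m] / runs[m] for m in sorted(runs)}
```

Plotting this per prompt, per model, is usually the first chart worth building from the export.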

For the competitor gap analysis piece specifically, tools like Promptwatch have this built in as an Answer Gap Analysis feature -- it surfaces exactly which prompts competitors are visible for that you're not, without requiring you to build the analysis yourself from raw API data.


A note on what citation data can and can't tell you

Before you build a whole reporting system around 6 months of citation history, it's worth being honest about what the data represents.

Citation frequency is a useful signal, but it's not a clean metric. The Passionfruit research cited earlier found that 85% of content AI retrieves is never actually shown to users, and that being cited is not the same as being recommended. A citation in a list of sources is different from a direct recommendation.

That doesn't mean the data is worthless -- it means you should pair it with traffic attribution data to understand which citations are actually driving visits and conversions. Server log analysis, UTM tracking, and GSC integration can all help close that loop.

The goal isn't to maximize citation count. It's to be cited in the right prompts, by the right models, in a way that drives qualified traffic. Six months of historical data helps you understand where you've been. What you do with that understanding is what matters.
