How to track AI traffic to your website: referrals and crawlers

2026-06-04

AI traffic to your site comes in two forms, and most analytics tools catch only one of them. The first is referral traffic, where a person clicks through to your site from an AI answer in ChatGPT, Claude, or Perplexity. The second is crawler traffic, where a bot like GPTBot or ClaudeBot fetches your pages to train or ground a model. Referrals show up in browser-side analytics, while crawlers appear only in server-side requests, so a tool that catches one can be blind to the other.

I went looking for how much of either lodd.dev was getting, and the honest answer turned out to depend entirely on which one I meant. One of them my own tracker could see straight away. The other it could not see at all, for a reason that's worth understanding before you trust any analytics tool that claims to show you AI traffic.

Referral traffic and crawler traffic are different problems

Referral traffic and crawler traffic need different tracking because they reach your site in different ways. A referral is a real person in a real browser: they read an answer in Claude, click your link, and their browser loads your page and runs whatever scripts are on it. The referrer header says claude.ai, so a normal analytics script sees the visit like any other.

A crawler is not a browser. GPTBot, ClaudeBot, and PerplexityBot send an HTTP request, read the raw HTML, follow links, and move on. They don't render the page or run scripts, because running a full browser engine for every page would be slow and expensive at crawl scale. So the same client-side script that catches your referral traffic never fires for them. If you only have a JavaScript tracker, you're seeing one half of your AI traffic and missing the other entirely.

How do you see which AI tools send you visitors?

To see which AI tools send you visitors, filter your traffic sources by referrer for the AI domains: chatgpt.com, claude.ai, perplexity.ai, and gemini.google.com. This is the referral half, and it works out of the box with the standard tracking script, because the visitor is a real browser carrying a real referrer.
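
If you want to do the same filtering in your own code, here's a minimal sketch of the idea. It maps a referrer URL to one of the four AI domains listed above. The function name, the labels, and the decision to strip a leading "www." are my own assumptions for illustration, not how Lodd classifies referrers internally.

```typescript
// Hostnames come from the list above; the human-readable labels are illustrative.
const AI_REFERRERS = new Map<string, string>([
  ["chatgpt.com", "ChatGPT"],
  ["claude.ai", "Claude"],
  ["perplexity.ai", "Perplexity"],
  ["gemini.google.com", "Gemini"],
]);

// Returns the AI tool a referrer belongs to, or null if it is not an AI referrer.
function classifyAiReferrer(referrer: string | null | undefined): string | null {
  if (!referrer) return null;

  let host: string;
  try {
    host = new URL(referrer).hostname.toLowerCase();
  } catch {
    return null; // not a parseable absolute URL
  }

  // Assumption: treat "www.perplexity.ai" the same as "perplexity.ai".
  if (host.startsWith("www.")) host = host.slice(4);

  return AI_REFERRERS.get(host) ?? null;
}
```

Matching on the exact hostname rather than a substring means a lookalike host such as claude.ai.example.com is not counted as an AI referral.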

Because Lodd is built for agents, the natural way to ask is in plain language. In Claude Code or any MCP client, you'd say:

How much traffic did claude.ai and chatgpt.com send me
this month compared to last month?

The agent calls get_traffic_sources with a referrer filter for each period and tells you whether AI referrals are growing. If you want the full walkthrough of querying traffic this way, the setup for Claude Code covers it. The point is that you're not opening a dashboard to read a number. You're asking the agent that's already in your terminal, and it has the traffic context to answer.

Why your JavaScript tracker can't see GPTBot

Your JavaScript tracker can't see GPTBot because GPTBot doesn't run JavaScript. The tracking script only records a hit when it executes in a browser and sends a beacon back. A crawler fetches your HTML and never runs the script, so no beacon is ever sent, and the visit leaves no trace in client-side analytics. It doesn't matter how good the tool's bot detection is, because detection only runs on requests that actually reach it.

This is the part that catches people out. You add an analytics tool, check the bot report, see almost nothing, and conclude the AI crawlers aren't visiting. They almost certainly are. They're just hitting your server and leaving before any script runs. To see them, you have to look in the one place they can't avoid: the server itself, where every request arrives with its User-Agent attached.
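
To make "look at the User-Agent on the server" concrete, here's a minimal sketch that checks a User-Agent header for the three crawler names mentioned above. The token list and function name are assumptions for illustration. Real crawlers may use other tokens, and a User-Agent can be spoofed, so this only tells you what a request claims to be.

```typescript
// The three names come from the text above; treat this as a starting list,
// not a complete registry of AI crawlers.
const AI_CRAWLER_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot"];

// Returns the matching crawler name, or null if the User-Agent matches none of them.
function detectAiCrawler(userAgent: string | undefined): string | null {
  if (!userAgent) return null;

  // Compare case-insensitively so "gptbot" and "GPTBot" both match.
  const ua = userAgent.toLowerCase();
  for (const token of AI_CRAWLER_TOKENS) {
    if (ua.includes(token.toLowerCase())) return token;
  }
  return null;
}
```

You could run this over your server's access logs or call it from a request handler. Either way, the check happens where the request arrives rather than in the browser.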

How do you track AI crawlers like GPTBot and ClaudeBot?

To track AI crawlers, you capture requests on the server, where the User-Agent is visible, and forward them to your analytics. With the @lodd/node SDK, you record a server-side page view and pass the request's User-Agent. The same bot detection that runs on the browser path tags crawlers as bots, so they show up in your bot report instead of polluting your human numbers:

import { Lodd } from "@lodd/node";

const lodd = new Lodd({
  apiKey: process.env.LODD_API_KEY,
  siteId: process.env.LODD_SITE_ID,
});

// In your request handler or middleware:
lodd.pageview({
  url: req.path,
  userAgent: req.headers["user-agent"],
});

If you're on Express, the middleware does it for you. It forwards only crawler requests, so you don't double-count humans already covered by the browser script:

import { caExpress } from "@lodd/node/express";

// `app` is your Express app; `lodd` is the client created in the previous snippet.
app.use(caExpress(lodd, { pageviews: true }));

One honest limit: this needs a server. If your site is static and served from a CDN, the crawler is answered by the edge before any application code runs, so there's nothing to capture the request. You'd add the same forwarding in an edge middleware instead, which is exactly how lodd.dev tracks its own crawlers. And because crawler traffic isn't something you asked for, bot page views don't count toward your event quota.
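
Here's a rough sketch of what that edge forwarding could look like, independent of any particular edge runtime. Everything in it is an assumption for illustration: the request shape, the crawler pattern, and the `forward` callback, which stands in for whatever call sends the hit to your analytics. It is not lodd.dev's actual middleware.

```typescript
// Minimal structural types so the sketch does not depend on one edge runtime.
interface IncomingRequest {
  url: string; // absolute URL, as edge runtimes typically provide
  headers: { get(name: string): string | null };
}

interface CrawlerHit {
  url: string;
  userAgent: string;
}

// Illustrative pattern built from the crawler names mentioned in this post.
const CRAWLER_PATTERN = /GPTBot|ClaudeBot|PerplexityBot/i;

// `forward` is a hypothetical callback that sends the hit to your analytics.
// Returns true if the request looked like an AI crawler and was forwarded.
function captureCrawlerHit(
  request: IncomingRequest,
  forward: (hit: CrawlerHit) => Promise<void>,
): boolean {
  const userAgent = request.headers.get("user-agent") ?? "";

  // Skip everything else: humans are already covered by the browser script.
  if (!CRAWLER_PATTERN.test(userAgent)) return false;

  const hit: CrawlerHit = { url: new URL(request.url).pathname, userAgent };

  // Fire and forget: a failed analytics call should never break the page response.
  forward(hit).catch(() => {});
  return true;
}
```

The middleware would call this and then serve the page as usual. On runtimes that stop work once the response is sent, you'd likely need the platform's mechanism for keeping the forwarding promise alive.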

What this tells you about your AEO

Together, the two signals tell you whether your AEO is actually working. Crawler traffic tells you whether AI engines are indexing you at all, which is the precondition for ever being cited. Referral traffic tells you whether those citations turn into real visits, which is the part that matters to your business. Seeing GPTBot crawl you but no referrals arriving is a different problem from never being crawled in the first place, and you can only tell them apart if you measure both.
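
The reasoning above can be sketched as a small decision function. The status names and the zero thresholds are my own simplification, and the fourth case (referrals but no crawler hits) is my reading of the earlier point that missing crawler data usually means the tracking can't see them, not that they aren't visiting.

```typescript
type AeoDiagnosis =
  | "not-crawled" // no crawler hits and no AI referrals recorded
  | "crawled-but-not-referred" // indexed, but citations are not turning into visits
  | "crawled-and-referred" // both halves are showing up
  | "referrals-without-crawler-data"; // likely a gap in server-side tracking

// Both inputs are counts over the same period, e.g. the last 30 days.
function diagnoseAeo(crawlerHits: number, aiReferrals: number): AeoDiagnosis {
  if (crawlerHits > 0 && aiReferrals > 0) return "crawled-and-referred";
  if (crawlerHits > 0) return "crawled-but-not-referred";
  if (aiReferrals > 0) return "referrals-without-crawler-data";
  return "not-crawled";
}
```

With only one of the two counts, the first two cases collapse into each other, which is the argument for measuring both.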

This is a new capability, and I'm still building up lodd.dev's own numbers, so I won't pretend to have a tidy case study yet. What I can say is that the question "is being cited by AI sending me anything?" stopped being unanswerable the moment both halves were in one place. Your agent can pull crawler counts and referral counts in the same conversation and reason about the gap between them, which is closer to a useful answer than any single number on a dashboard. If you want the technical detail on server-side tracking, the docs cover it, and the Google Search Console setup pairs search visibility with the same traffic data.