Question 1

What is Token Telemetry?

Accepted Answer

Token Telemetry (also written TokenTelemetry, sometimes misspelled as 'token telementry' or 'tokentelementry') is a free, open-source, 100% local observability dashboard for AI coding agents like Claude Code, Codex, Gemini CLI, Cursor, and GitHub Copilot. It tracks tokens, cost, tool calls, and reasoning by reading the log files those agents already write — no SDK, no signup, no cloud.

Question 2

How do I track Claude Code token usage?

Accepted Answer

Install TokenTelemetry, run Claude Code normally, and open http://localhost:3000. TokenTelemetry auto-detects Claude Code sessions from ~/.claude/ logs — no instrumentation, no SDK, no config.
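
To make the log-reading idea concrete, here is a minimal sketch of totalling token usage from JSONL session logs. It is not TokenTelemetry's implementation. It assumes each log line is a JSON object whose `message.usage` holds `input_tokens` and `output_tokens`; that schema and the helper names `sum_usage` and `sum_usage_in_dir` are assumptions for illustration, and real logs may differ.

```python
import json
from collections import Counter
from pathlib import Path

# Assumed field names; the real log schema may differ.
USAGE_FIELDS = ("input_tokens", "output_tokens")


def sum_usage(lines):
    """Total the assumed usage fields across an iterable of JSONL lines."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # records without usage data carry no token counts
        for field in USAGE_FIELDS:
            value = usage.get(field, 0)
            if isinstance(value, int):
                totals[field] += value
    return dict(totals)


def sum_usage_in_dir(root):
    """Apply sum_usage to every *.jsonl file found under root."""
    totals = Counter()
    for path in Path(root).expanduser().rglob("*.jsonl"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            totals.update(sum_usage(handle))
    return dict(totals)
```

Calling `sum_usage_in_dir("~/.claude")` would then return one dictionary of totals for every JSONL file under that directory, provided the files match the assumed schema.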

Question 3

How do I monitor Google Antigravity, Codex, and Gemini CLI costs?

Accepted Answer

TokenTelemetry auto-reads logs from Google Antigravity (Google's agentic coding CLI), OpenAI Codex CLI, Gemini CLI, Cursor, GitHub Copilot, Qwen CLI, OpenCode, Vibe, and Grok Build (xAI). Token counts and dollar costs appear in the local dashboard automatically.

Question 4

Is there a free tool to monitor AI coding agent token usage?

Accepted Answer

Yes — TokenTelemetry is free, open-source (MIT), and runs 100% locally. No account, no signup, no cloud.

Question 5

Does TokenTelemetry send my data to the cloud?

Accepted Answer

Your logs, sessions, prompts, tokens, and costs never leave your computer: the dashboard reads local files and serves a UI on localhost. The app does send anonymous, content-free usage stats (which pages and features you use, never your code, prompts, paths, or costs) so we know what to improve. This is on by default; you can inspect the exact payload and turn it off in Settings → Usage & privacy, or by setting DO_NOT_TRACK=1. There's also an optional GitHub update check that carries no usage data; disable it with TT_NO_UPDATE_CHECK=1.
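
If you start the dashboard from a script, both opt-outs can be set in the environment you hand to the child process. The sketch below only builds that environment. The helper name `private_env` is hypothetical, and the launch command is deliberately left out because it depends on how you installed TokenTelemetry.

```python
import os


def private_env(base=None):
    """Return a copy of the environment with both opt-out variables set."""
    env = dict(os.environ if base is None else base)
    env["DO_NOT_TRACK"] = "1"        # turns off the anonymous usage stats
    env["TT_NO_UPDATE_CHECK"] = "1"  # disables the GitHub update check
    return env
```

Pass the result as the `env` argument of `subprocess.run` together with whatever command you normally use to start the dashboard.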

Question 6

How does TokenTelemetry compare to Langfuse or Helicone?

Accepted Answer

TokenTelemetry is purpose-built for AI coding agents and is zero-config — no SDK instrumentation. Langfuse and Helicone are general LLM-app observability platforms that require code changes and (typically) a cloud account.

Question 7

Which agents does it support?

Accepted Answer

Ten coding agents (Claude Code, OpenAI Codex, Gemini CLI, Cursor, GitHub Copilot, Qwen CLI, OpenCode, Vibe, Antigravity, Grok Build) plus Hermes Agent — Nous Research's autonomous agent, which gets its own dedicated dashboard at /hermes with gateway health, scheduled-job monitoring, skills + memory observability, and 38 source platforms (CLI / Telegram / Discord / Feishu / DingTalk / cron / webhook / …).

Question 8

Why does Hermes Agent get its own page?

Accepted Answer

Hermes is structurally different from coding agents — it runs across messaging platforms (Telegram / Discord / Slack / WhatsApp / Signal / Matrix / Feishu / DingTalk / WeChat), supports persistent skills and memory, delegates to subagents, and runs scheduled cron jobs. Forcing it into the same UI as Claude Code would hide most of what it does, so it gets a dedicated surface that respects its shape.

Question 9

Can I use TokenTelemetry from inside Hermes Dashboard?

Accepted Answer

Yes. There's a Hermes Dashboard plugin that registers a 'TokenTelemetry' tab inside Hermes's web UI on port 9119. It's a thin launcher: deep-link cards open the relevant TokenTelemetry page (Hermes Overview, Skills, Memory, Analytics, Projects) in a new browser tab, so you don't have to remember a second port. Install it with `./scripts/install-hermes-plugin.sh` from the TokenTelemetry repo, then run `hermes dashboard`.

Section	What it contains
What	A one-sentence description of what the session accomplished
Tools	Which tools were called and how many times
Why	The inferred goal or task context
Next	Suggested follow-up steps or open questions

Field	Description
Title	Short category label (e.g. "API key invalid")
Message	Plain-English explanation
Hint	Actionable step (e.g. "Run `claude login`")
Show raw error	Disclosure triangle revealing the truncated raw output for bug reports

Category	Trigger	Hint
`auth`	HTTP 401, invalid key	Backend-specific login command or env var
`too_large`	HTTP 413, context overflow	Use a model with a larger context window
`quota`	HTTP 429, rate limit	Wait and retry, or pick a cheaper model
`model`	Model not found	Pick a model your account can access
`timeout`	Request timed out	Use a faster model or increase `TT_<BACKEND>_TIMEOUT`
`network`	Connection refused	Check the backend is running (e.g. `ollama serve`)
`no_output`	Empty response	Try a different model or regenerate
`unknown`	Anything else	Shows the provider's message, with the raw error available
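
The category table above can be read as a lookup from a failure signal to a hint. The sketch below shows one way to express that mapping; it is illustrative, not TokenTelemetry's classifier. The function name `classify_error`, its parameters, the substring checks, and the order of the checks are assumptions. Only the category names, HTTP status codes, and hint wording come from the table.

```python
# Hint text per category, copied from the table above.
HINTS = {
    "auth": "Backend-specific login command or env var",
    "too_large": "Use a model with a larger context window",
    "quota": "Wait and retry, or pick a cheaper model",
    "model": "Pick a model your account can access",
    "timeout": "Use a faster model or increase TT_<BACKEND>_TIMEOUT",
    "network": "Check the backend is running (e.g. ollama serve)",
    "no_output": "Try a different model or regenerate",
    "unknown": "Shows the provider's message, with the raw error available",
}


def classify_error(status=None, message="", empty_response=False):
    """Map an HTTP status and/or error message to a category name."""
    text = message.lower()
    if status == 401 or "invalid key" in text:
        return "auth"
    if status == 413 or "context overflow" in text:
        return "too_large"
    if status == 429 or "rate limit" in text:
        return "quota"
    if "model not found" in text:
        return "model"
    if "timed out" in text or "timeout" in text:
        return "timeout"
    if "connection refused" in text:
        return "network"
    if empty_response:
        return "no_output"
    return "unknown"  # anything that matched no rule above
```

For example, `HINTS[classify_error(status=429)]` yields the `quota` hint, while an unrecognized failure falls through to `unknown`.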
