
AgentWatch Docs

Everything you need to start monitoring your AI agent costs in minutes.

Quickstart guide

Get from zero to tracked in under 5 minutes. Pick your provider below — OpenAI, Anthropic, Gemini, Groq, Mistral, or Ollama. For no-SDK setups, see the webhook integration section.

One-line SDK wrapper. Tracking is fully automatic — no changes to your existing calls.

1. Install the SDK

npm install @itsmemyan/agentwatch
2. Wrap your OpenAI client

Get your API key from Dashboard → Integrations and your agent ID from the agent detail page.

import OpenAI from "openai";
import { wrap } from "@itsmemyan/agentwatch";

// Wrap once — e.g. in lib/openai.ts
export const openai = wrap(new OpenAI(), {
  apiKey: process.env.AGENTWATCH_API_KEY!,
  agentId: "your-agent-id",   // find this in your dashboard
});
3. Use your client as normal

No code changes needed — the wrapper intercepts every call automatically.

// Use openai exactly as before — tracking is automatic
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: userMessage }],
});

// Streaming works too — usage captured from final chunk
const stream = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: userMessage }],
  stream: true,
});
4. View your agent in the dashboard

Within seconds of your first tracked request, your agent appears in the dashboard with real-time cost graphs, request counts, latency histograms, and error rates.

5. Set a monthly budget

Go to your agent's settings and set a monthly spend limit. AgentWatch emails you at 50%, 80%, and 100% of your budget — so you're never caught off-guard.

Webhook integration

If you don't want to use the SDK, send events directly via HTTP. Works with any language, framework, or automation tool.

Endpoint

POST https://getagentwatch.com/api/v1/ingest
// Works with any provider — call your LLM, then send token counts here.
// Install: npm install node-fetch (or use native fetch in Node 18+)

async function trackAgentRequest({
  agentId,
  model,
  promptTokens,
  completionTokens,
  latencyMs,
  success,
  errorMessage,
}) {
  await fetch("https://getagentwatch.com/api/v1/ingest", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": "Bearer YOUR_API_KEY",
    },
    body: JSON.stringify({
      agent_id: agentId,
      model,
      prompt_tokens: promptTokens,
      completion_tokens: completionTokens,
      latency_ms: latencyMs,
      success,
      error_message: errorMessage ?? null,
      timestamp: new Date().toISOString(),
    }),
  });
}

// See "Provider token extraction" below for how to get promptTokens /
// completionTokens from OpenAI, Anthropic, Gemini, Groq, Mistral, or Ollama.
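The snippet above only reports successes, but the schema also carries success and error_message. A small wrapper can report failures too. This is a hedged sketch, not part of the SDK: withTracking and getTokens are hypothetical names, and report is expected to be the trackAgentRequest function defined above.

```typescript
// Hypothetical helper (not part of the SDK): times any async LLM call and
// reports both successes and failures via trackAgentRequest.
type TrackedEvent = {
  agentId: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  latencyMs: number;
  success: boolean;
  errorMessage?: string;
};

async function withTracking<T>(
  opts: { agentId: string; model: string },
  call: () => Promise<T>,
  getTokens: (result: T) => { promptTokens: number; completionTokens: number },
  report: (event: TrackedEvent) => Promise<void>, // e.g. trackAgentRequest
): Promise<T> {
  const start = Date.now();
  try {
    const result = await call();
    // Success path: extract token counts from the provider response.
    await report({
      ...opts,
      ...getTokens(result),
      latencyMs: Date.now() - start,
      success: true,
    });
    return result;
  } catch (err) {
    // Failure path: no tokens were billed, but record the error and latency.
    await report({
      ...opts,
      promptTokens: 0,
      completionTokens: 0,
      latencyMs: Date.now() - start,
      success: false,
      errorMessage: err instanceof Error ? err.message : String(err),
    });
    throw err; // let the caller handle the original error
  }
}
```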

Provider token extraction

Each provider exposes token counts in a slightly different field. The examples below use the OpenAI field names; pass the equivalent counts from your provider to trackAgentRequest.

// Node.js / TypeScript
const start = Date.now();
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: userMessage }],
});
await trackAgentRequest({
  agentId: "support-bot",
  model: "gpt-4o",
  promptTokens: response.usage.prompt_tokens,
  completionTokens: response.usage.completion_tokens,
  latencyMs: Date.now() - start,
  success: true,
});

# Python
import time

# track_agent_request is your Python equivalent of the fetch call above
start = time.time()
response = client.chat.completions.create(model="gpt-4o", messages=[...])
track_agent_request(
    agent_id="support-bot",
    model="gpt-4o",
    prompt_tokens=response.usage.prompt_tokens,
    completion_tokens=response.usage.completion_tokens,
    latency_ms=int((time.time() - start) * 1000),
    success=True,
)
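For the other providers, the field names differ. The sketch below maps the fields each provider's API commonly exposes; verify the names against your SDK version, and note that extractTokens is a hypothetical helper, not part of the SDK.

```typescript
type TokenCounts = { promptTokens: number; completionTokens: number };

// Map a raw provider response to the counts trackAgentRequest expects.
// Field names are as commonly exposed by each provider's API.
function extractTokens(provider: string, response: any): TokenCounts {
  switch (provider) {
    case "openai":
    case "groq":
    case "mistral": // OpenAI-compatible usage object
      return {
        promptTokens: response.usage.prompt_tokens,
        completionTokens: response.usage.completion_tokens,
      };
    case "anthropic": // Messages API usage block
      return {
        promptTokens: response.usage.input_tokens,
        completionTokens: response.usage.output_tokens,
      };
    case "gemini": // usageMetadata on the generateContent response
      return {
        promptTokens: response.usageMetadata.promptTokenCount,
        completionTokens: response.usageMetadata.candidatesTokenCount,
      };
    case "ollama": // non-streaming /api/chat response
      return {
        promptTokens: response.prompt_eval_count,
        completionTokens: response.eval_count,
      };
    default:
      throw new Error(`Unknown provider: ${provider}`);
  }
}
```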

No-code setup (n8n / Make.com)

If you build agents in n8n or Make.com, add an HTTP Request node as the last step of your workflow:

  • Method: POST
  • URL: https://getagentwatch.com/api/v1/ingest
  • Auth header: Authorization: Bearer YOUR_API_KEY
  • Body: JSON — see payload schema below

API Reference

The AgentWatch REST API accepts JSON and returns JSON. All requests must include your API key in the Authorization header.

POST /api/v1/ingest

Ingests a single agent request event. Call this after every LLM API call you want to track.

Request body

{
  "agent_id": "your-agent-id-here",   // string, required — unique slug for your agent
  "model": "gpt-4o",                   // string, required — model name
  "prompt_tokens": 450,                // number, required — input token count
  "completion_tokens": 120,            // number, required — output token count
  "latency_ms": 834,                   // number, required — wall-clock time in ms
  "success": true,                     // boolean, required
  "error_message": null,               // string | null, optional
  "timestamp": "2026-04-13T10:23:00Z", // ISO-8601 string, optional (defaults to now)
  "metadata": {                        // object, optional — break down cost by any dimension
    "user_id": "usr_123",
    "task_type": "summarize",
    "session_id": "sess_abc"
  },
  "version": "v2.1.0"                 // string, optional — track cost per deploy / git SHA
}
Field              | Type            | Required | Description
agent_id           | string          | Yes      | Unique identifier for your agent. Use a consistent slug like support-bot.
model              | string          | Yes      | Model name, e.g. gpt-4o, claude-3-5-sonnet-20241022.
prompt_tokens      | number          | Yes      | Number of input (prompt) tokens used.
completion_tokens  | number          | Yes      | Number of output (completion) tokens generated.
latency_ms         | number          | Yes      | Total wall-clock time in milliseconds.
success            | boolean         | Yes      | Whether the request succeeded.
error_message      | string or null  | No       | Error message if success is false.
timestamp          | ISO-8601 string | No       | Defaults to the time the request is received.
metadata           | object          | No       | Arbitrary key-value pairs (user_id, task_type, session_id, etc.) used for cost breakdown by dimension.
version            | string          | No       | Deploy version or git SHA. Enables cost-per-deploy tracking on the agent detail page.

Response

200 OK: Event accepted.
{ "ok": true, "event_id": "evt_01HXYZ..." }
400 Bad Request: Missing or invalid fields.
{ "ok": false, "error": "prompt_tokens is required" }
401 Unauthorized: Invalid or missing API key.
{ "ok": false, "error": "Invalid API key" }
429 Too Many Requests: Rate limit exceeded (1,000 req/min).
{ "ok": false, "error": "Rate limit exceeded" }
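A 429 means you briefly exceeded 1,000 requests per minute, so retrying after a short backoff usually succeeds. A minimal sketch, where sendOnce stands in for the fetch call shown in the webhook section and ingestWithRetry is a hypothetical helper, not part of the SDK:

```typescript
// Retry the ingest call on 429 with exponential backoff.
// sendOnce performs one HTTP request and resolves with its status code.
async function ingestWithRetry(
  sendOnce: () => Promise<{ status: number }>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<{ status: number }> {
  let res = await sendOnce();
  for (let attempt = 1; res.status === 429 && attempt < maxAttempts; attempt++) {
    await sleep(250 * 2 ** attempt); // 500 ms, 1 s, 2 s, ...
    res = await sendOnce();
  }
  return res; // last response, whether 2xx or still 429
}
```

A production client might also batch events locally and flush them on an interval, which keeps well under the rate limit.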

Metadata & deploy versions

Pass a metadata object with each event to break down costs by user, task type, or any dimension. Pass version to see how costs change across deploys. Both appear in the agent detail page.

// Global metadata — applied to every event from this client
const openai = wrap(new OpenAI(), {
  apiKey: process.env.AGENTWATCH_API_KEY!,
  agentId: "support-bot",
  version: process.env.DEPLOY_SHA,   // git SHA or semver tag
  metadata: { service: "support" },  // static labels
});

// Per-request metadata — merged with global metadata
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: userMessage }],
  _aw: { metadata: { user_id: req.user.id, plan: "growth" } },
} as any);

// Dashboard will show cost broken down by user_id, plan, service

API v2 — Query your data

Business plan users can query their AgentWatch data programmatically — useful for building internal dashboards, feeding BI tools, or automating reports. Authenticated via your existing API key. Rate limit: 60 requests/min.

Endpoint                      | Description
GET /api/v2/agents            | List all your agents
GET /api/v2/stats?days=30     | Dashboard-level spend + model breakdown
GET /api/v2/agents/:id/events | Paginated event log for one agent
GET /api/v2/agents/:id/stats  | Per-agent spend, model breakdown, error rate
curl https://getagentwatch.com/api/v2/agents \
  -H "X-API-Key: YOUR_API_KEY"

# Response:
# { "agents": [{ "id": "...", "name": "support-bot", "provider": "openai", ... }] }
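The same call from Node 18+ (native fetch) might look like the sketch below. listAgents is a hypothetical helper, and the response shape beyond the curl example above is an assumption.

```typescript
// Sketch: list agents via the v2 API. fetchFn is injectable for testing;
// it defaults to the global fetch available in Node 18+.
async function listAgents(
  apiKey: string,
  fetchFn: (
    url: string,
    init?: { headers?: Record<string, string> },
  ) => Promise<{ ok: boolean; status: number; json(): Promise<any> }> = fetch,
): Promise<Array<{ id: string; name: string; provider: string }>> {
  const res = await fetchFn("https://getagentwatch.com/api/v2/agents", {
    headers: { "X-API-Key": apiKey },
  });
  if (!res.ok) throw new Error(`AgentWatch API error: ${res.status}`);
  const body = await res.json();
  return body.agents;
}
```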

Supported models

AgentWatch knows the per-token pricing for all major models. Pass any of these in the model field and costs are calculated automatically.

Model                      | Provider  | Input (per 1M tokens) | Output (per 1M tokens)
gpt-4o                     | OpenAI    | $2.50                 | $10.00
gpt-4o-mini                | OpenAI    | $0.15                 | $0.60
o3                         | OpenAI    | $10.00                | $40.00
o4-mini                    | OpenAI    | $1.10                 | $4.40
claude-sonnet-4-6          | Anthropic | $3.00                 | $15.00
claude-haiku-4-5           | Anthropic | $0.80                 | $4.00
claude-opus-4-6            | Anthropic | $15.00                | $75.00
claude-3-5-sonnet-20241022 | Anthropic | $3.00                 | $15.00
claude-3-5-haiku-20241022  | Anthropic | $0.80                 | $4.00
gemini-2.0-flash           | Google    | $0.10                 | $0.40
gemini-2.0-flash-lite      | Google    | $0.075                | $0.30
gemini-1.5-pro             | Google    | $3.50                 | $10.50
gemini-1.5-flash           | Google    | $0.075                | $0.30
llama-3.3-70b-versatile    | Groq      | $0.59                 | $0.79
llama-3.1-8b-instant       | Groq      | $0.05                 | $0.08
mixtral-8x7b-32768         | Groq      | $0.24                 | $0.24
mistral-large-latest       | Mistral   | $3.00                 | $9.00
mistral-small-latest       | Mistral   | $0.20                 | $0.60
codestral-latest           | Mistral   | $0.20                 | $0.60

Custom/unknown models are tracked with token counts only; costs show as “unknown” until you configure a custom price in agent settings.
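Cost is derived from the table above: tokens divided by one million, times the per-1M price for each direction. A sketch of the arithmetic, where PRICES holds just two rows for illustration and costUsd is a hypothetical helper, not part of the SDK:

```typescript
// Per-1M-token prices for two models from the table above.
const PRICES: Record<string, { input: number; output: number }> = {
  "gpt-4o": { input: 2.5, output: 10.0 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
};

// Returns the request cost in USD, or null for an unknown model
// (mirroring the "unknown" behavior described above).
function costUsd(
  model: string,
  promptTokens: number,
  completionTokens: number,
): number | null {
  const p = PRICES[model];
  if (!p) return null;
  return (promptTokens / 1e6) * p.input + (completionTokens / 1e6) * p.output;
}

// The 450-prompt / 120-completion gpt-4o request from the ingest example:
// 450/1M * $2.50 + 120/1M * $10.00 = $0.002325
```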

Need help?

Can't find what you're looking for? Email us at support@getagentwatch.com and we'll get back to you within one business day.