Cost Management

Agent Token Monitor

Track token consumption, manage budgets, and receive cost alerts across all your AI agents and providers. Token Monitor gives you real-time visibility into spending across OpenAI, Anthropic, Google, Mistral, DeepSeek, and any custom model endpoint.

Supported Providers

Token Monitor supports the following providers out of the box.

OpenAI

GPT-4, GPT-5, o-series models

Anthropic

Claude Opus, Sonnet, Haiku models

Google

Gemini Pro, Flash, Ultra models

Mistral

Mistral Large, Medium, Small models

DeepSeek

Reasoner, Coder models

Custom

Self-hosted or custom model endpoints

Recording Token Usage

Report token usage after each model call. DRD automatically calculates costs based on the provider's current pricing. For custom models, configure pricing in your workspace settings.

POST /v1/token-usage

curl -X POST https://api.drd.io/v1/token-usage \
  -H "Authorization: Bearer drd_ws_..." \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "01956abc-def0...",
    "provider": "anthropic",
    "model": "claude-opus-4-6",
    "usage": {
      "inputTokens": 2450,
      "outputTokens": 890,
      "cachedTokens": 1200,
      "reasoningTokens": 0
    },
    "metadata": {
      "requestId": "req_abc123",
      "taskType": "content-analysis",
      "project": "marketing-compliance"
    },
    "timestamp": "2026-02-14T10:30:00Z"
  }'

201 Created

{
  "id": "usage_01xyz...",
  "agentId": "01956abc-def0...",
  "provider": "anthropic",
  "model": "claude-opus-4-6",
  "usage": {
    "inputTokens": 2450,
    "outputTokens": 890,
    "cachedTokens": 1200,
    "totalTokens": 3340
  },
  "cost": {
    "inputCost": 0.03675,
    "outputCost": 0.06675,
    "cachedDiscount": -0.009,
    "totalCost": 0.0945,
    "currency": "USD"
  },
  "budget": {
    "dailyUsed": 12.45,
    "dailyLimit": 100.00,
    "monthlyUsed": 234.67,
    "monthlyLimit": 5000.00
  }
}
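The cost fields in the sample response appear to add up as inputCost + outputCost + cachedDiscount = totalCost (0.03675 + 0.06675 - 0.009 = 0.0945). The sketch below checks that relationship on the client side. The formula is inferred from the sample values rather than from a documented contract, and `UsageCost`, `expectedTotal`, and `costIsConsistent` are hypothetical names that are not part of the DRD SDK.

```typescript
// Shape of the "cost" object in the sample 201 response.
interface UsageCost {
  inputCost: number;
  outputCost: number;
  cachedDiscount: number; // negative value: a credit for cached tokens
  totalCost: number;
  currency: string;
}

// Assumption: totalCost = inputCost + outputCost + cachedDiscount.
function expectedTotal(cost: UsageCost): number {
  return cost.inputCost + cost.outputCost + cost.cachedDiscount;
}

// Compare with a small tolerance to absorb floating-point rounding.
function costIsConsistent(cost: UsageCost, tolerance = 1e-9): boolean {
  return Math.abs(expectedTotal(cost) - cost.totalCost) <= tolerance;
}

// Values copied from the sample response.
const sampleCost: UsageCost = {
  inputCost: 0.03675,
  outputCost: 0.06675,
  cachedDiscount: -0.009,
  totalCost: 0.0945,
  currency: "USD",
};

console.log("Cost breakdown is consistent:", costIsConsistent(sampleCost));
```

A check like this can be useful during billing reconciliation, where a mismatch would suggest a pricing change or a field the sketch does not account for.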

SDK: Automatic Token Tracking

Use the SDK to record token usage programmatically or wrap your provider clients for automatic tracking.

token-tracking.ts

import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";
import { DRDClient } from "@drd/sdk";
import { createTokenMiddleware } from "@drd/sdk/middleware";

const drd = new DRDClient({ apiKey: process.env.DRD_API_KEY! });

// Option 1: record token usage manually after a model call
const usage = await drd.tokens.record({
  agentId: "01956abc-def0...",
  provider: "openai",
  model: "gpt-5.2",
  usage: {
    inputTokens: 1500,
    outputTokens: 420,
    cachedTokens: 0,
    reasoningTokens: 300,
  },
  metadata: {
    taskType: "summarization",
    project: "content-pipeline",
  },
});

console.log("Cost:", usage.cost.totalCost);
console.log("Daily budget:", usage.budget.dailyUsed, "/", usage.budget.dailyLimit);

// Option 2: use the middleware for automatic tracking
const tokenMiddleware = createTokenMiddleware({
  drdClient: drd,
  agentId: "01956abc-def0...",
  defaultMetadata: { project: "content-pipeline" },
});

// Your existing provider clients (each reads its API key from the environment)
const originalOpenAIClient = new OpenAI();
const originalAnthropicClient = new Anthropic();

// Wrap your OpenAI/Anthropic client
const openai = tokenMiddleware.wrapOpenAI(originalOpenAIClient);
const anthropic = tokenMiddleware.wrapAnthropic(originalAnthropicClient);

// Now all calls made through the wrapped clients are tracked
const response = await openai.chat.completions.create({
  model: "gpt-5.2",
  messages: [{ role: "user", content: "Summarize this article..." }],
});
// Token usage is automatically reported to DRD
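If you record usage manually instead of using the middleware, you need to translate the provider's usage object into the fields that the record call takes. The sketch below does this for the usage object returned by OpenAI chat completions. `OpenAIUsageLike`, `DRDUsage`, and `toDRDUsage` are hypothetical names, not SDK exports. The sketch passes the provider's counts through unchanged, so it assumes DRD accepts input counts that include cached tokens and output counts that include reasoning tokens, which is how OpenAI reports them.

```typescript
// Subset of the usage object returned by OpenAI chat completions.
interface OpenAIUsageLike {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number };
}

// Usage fields as they appear in the record examples on this page.
interface DRDUsage {
  inputTokens: number;
  outputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
}

// Hypothetical helper: maps provider counts to the DRD field names.
// Assumption: counts are passed through unchanged, with missing
// detail fields treated as zero.
function toDRDUsage(u: OpenAIUsageLike): DRDUsage {
  return {
    inputTokens: u.prompt_tokens,
    outputTokens: u.completion_tokens,
    cachedTokens: u.prompt_tokens_details?.cached_tokens ?? 0,
    reasoningTokens: u.completion_tokens_details?.reasoning_tokens ?? 0,
  };
}

const exampleUsage = toDRDUsage({
  prompt_tokens: 1500,
  completion_tokens: 420,
  completion_tokens_details: { reasoning_tokens: 300 },
});

console.log(exampleUsage);
```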

Budget Management

Set spending limits at the workspace, project, agent, or model level. Budgets can be configured as daily, weekly, or monthly limits with configurable enforcement actions when thresholds are reached.

budget-config.ts

import { DRDClient } from "@drd/sdk";

const drd = new DRDClient({ apiKey: process.env.DRD_API_KEY! });

// Set a budget for an agent
const budget = await drd.tokens.budgets.create({
  scope: "agent",
  scopeId: "01956abc-def0...",
  limits: {
    daily: 50.00,       // $50/day
    weekly: 250.00,     // $250/week
    monthly: 1000.00,   // $1,000/month
  },
  thresholds: [
    {
      percentage: 80,
      action: "notify",
      channels: ["email", "slack"],
    },
    {
      percentage: 95,
      action: "notify",
      channels: ["email", "slack", "pagerduty"],
    },
    {
      percentage: 100,
      action: "block",
      allowOverride: true,
    },
  ],
  currency: "USD",
});

// Set a budget per model
const modelBudget = await drd.tokens.budgets.create({
  scope: "model",
  scopeId: "gpt-5.2",
  limits: {
    monthly: 2000.00,
  },
  thresholds: [
    { percentage: 90, action: "notify", channels: ["slack"] },
    { percentage: 100, action: "throttle", rateLimit: 10 },
  ],
});

Enforcement: When a budget threshold triggers a block action, all subsequent token usage requests for that scope return a 429 Budget Exceeded response. Set allowOverride: true to allow workspace admins to bypass the block.
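To make the threshold behavior concrete, here is a minimal sketch of how a list of thresholds could be evaluated against current spend. It is not DRD's implementation. `triggeredThresholds` and `isBlocked` are hypothetical helpers, and the sketch assumes a threshold fires as soon as spend reaches its percentage of the limit (inclusive).

```typescript
type ThresholdAction = "notify" | "throttle" | "block";

interface Threshold {
  percentage: number; // percent of the limit, e.g. 80
  action: ThresholdAction;
}

// Hypothetical helper: returns every threshold that current spend has reached.
// The comparison is done as used * 100 >= percentage * limit to avoid
// floating-point division artifacts.
function triggeredThresholds(
  used: number,
  limit: number,
  thresholds: Threshold[],
): Threshold[] {
  if (limit <= 0) return [];
  return thresholds.filter((t) => used * 100 >= t.percentage * limit);
}

// Hypothetical helper: true once any reached threshold carries a "block" action.
function isBlocked(used: number, limit: number, thresholds: Threshold[]): boolean {
  return triggeredThresholds(used, limit, thresholds).some(
    (t) => t.action === "block",
  );
}

// Thresholds from the agent budget example, against the $50 daily limit.
const agentThresholds: Threshold[] = [
  { percentage: 80, action: "notify" },
  { percentage: 95, action: "notify" },
  { percentage: 100, action: "block" },
];

console.log(triggeredThresholds(40, 50, agentThresholds).length); // 1 (the 80% threshold)
console.log(isBlocked(40, 50, agentThresholds)); // false
console.log(isBlocked(50, 50, agentThresholds)); // true
```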

Cost Allocation

Allocate costs to projects, teams, or cost centers. Use metadata tags on token usage records to enable granular cost breakdowns.

GET /v1/token-usage/summary

curl -G "https://api.drd.io/v1/token-usage/summary" \
  -d "period=monthly" \
  -d "groupBy=project,model" \
  -d "from=2026-01-01" \
  -d "to=2026-01-31" \
  -H "Authorization: Bearer drd_ws_..."

// Response
{
  "period": { "from": "2026-01-01", "to": "2026-01-31" },
  "totalCost": 4523.67,
  "totalTokens": 48291034,
  "groups": [
    {
      "project": "content-pipeline",
      "model": "claude-opus-4-6",
      "inputTokens": 12500000,
      "outputTokens": 3200000,
      "cost": 1890.00,
      "requestCount": 8420
    },
    {
      "project": "content-pipeline",
      "model": "gpt-5.2",
      "inputTokens": 8900000,
      "outputTokens": 2100000,
      "cost": 1245.30,
      "requestCount": 5210
    },
    {
      "project": "marketing-compliance",
      "model": "gemini-3-pro",
      "inputTokens": 15000000,
      "outputTokens": 4200000,
      "cost": 988.37,
      "requestCount": 12400
    }
  ]
}
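Because the summary is grouped by both project and model, a per-project total needs one more roll-up on the client side. The sketch below sums the `cost` field of each group by project. `SummaryGroup` and `costByProject` are hypothetical names that mirror the fields in the sample response and are not SDK types.

```typescript
// One entry of the "groups" array in the sample summary response.
interface SummaryGroup {
  project: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  cost: number;
  requestCount: number;
}

// Hypothetical helper: total cost per project across all models.
function costByProject(groups: SummaryGroup[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const g of groups) {
    totals.set(g.project, (totals.get(g.project) ?? 0) + g.cost);
  }
  return totals;
}

// Groups copied from the sample response.
const sampleGroups: SummaryGroup[] = [
  { project: "content-pipeline", model: "claude-opus-4-6", inputTokens: 12500000, outputTokens: 3200000, cost: 1890.0, requestCount: 8420 },
  { project: "content-pipeline", model: "gpt-5.2", inputTokens: 8900000, outputTokens: 2100000, cost: 1245.3, requestCount: 5210 },
  { project: "marketing-compliance", model: "gemini-3-pro", inputTokens: 15000000, outputTokens: 4200000, cost: 988.37, requestCount: 12400 },
];

for (const [project, cost] of costByProject(sampleGroups)) {
  console.log(`${project}: $${cost.toFixed(2)}`);
}
```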

Usage Analytics

Every API call is logged with input/output token counts, cost, model, and provider. Query historical usage for analytics, trend analysis, and billing reconciliation.

analytics.ts

// Assumes `drd` is the client instance created in token-tracking.ts
// Get usage trends over time
const trends = await drd.tokens.analytics({
  period: "daily",
  from: "2026-01-01",
  to: "2026-01-31",
  groupBy: "provider",
});

for (const day of trends.dataPoints) {
  console.log(day.date);
  for (const provider of day.groups) {
    console.log(`  ${provider.name}: ${provider.totalTokens} tokens, $${provider.cost}`);
  }
}

// Get top consumers
const topAgents = await drd.tokens.topConsumers({
  period: "monthly",
  metric: "cost",
  limit: 10,
});

for (const agent of topAgents) {
  console.log(`${agent.agentName}: $${agent.totalCost} (${agent.totalTokens} tokens)`);
}

// Get model efficiency metrics
const efficiency = await drd.tokens.efficiency({
  agentId: "01956abc-def0...",
  period: "weekly",
});

console.log("Avg input/output ratio:", efficiency.avgInputOutputRatio);
console.log("Cache hit rate:", efficiency.cacheHitRate);
console.log("Cost per task:", efficiency.avgCostPerTask);

Alerts

Configure alerts for anomalous usage patterns, budget thresholds, and cost spikes. Alerts can be delivered via email, Slack, PagerDuty, or webhook.

alerts.ts

// Assumes `drd` is the client instance created in token-tracking.ts
// Create a spike detection alert
const alert = await drd.tokens.alerts.create({
  name: "Usage Spike Detection",
  type: "anomaly",
  config: {
    metric: "cost",
    window: "1h",
    threshold: 2.0,        // 2x normal baseline
    baselinePeriod: "7d",  // Compare against 7-day average
  },
  channels: [
    { type: "slack", webhookUrl: "https://hooks.slack.com/..." },
    { type: "email", to: ["ops@example.com"] },
  ],
  enabled: true,
});

// Create a daily summary alert
const dailySummary = await drd.tokens.alerts.create({
  name: "Daily Cost Summary",
  type: "scheduled",
  config: {
    schedule: "0 18 * * *",   // 6 PM daily
    timezone: "America/New_York",
    includeBreakdown: true,
    includeTopConsumers: true,
  },
  channels: [
    { type: "email", to: ["finance@example.com"] },
  ],
  enabled: true,
});
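The anomaly configuration above compares cost in the current window against a multiple of a baseline. As a rough illustration only, the sketch below treats the baseline as the mean cost of past windows and flags a spike when the current window exceeds that mean times the threshold. How DRD actually computes its baseline is not specified here, and `isCostSpike` is a hypothetical helper.

```typescript
// Hypothetical helper: not part of the DRD SDK.
// Assumption: the baseline is the arithmetic mean of past window costs,
// and a spike is a current cost strictly greater than baseline * threshold.
function isCostSpike(
  currentWindowCost: number,
  baselineWindowCosts: number[],
  threshold: number,
): boolean {
  if (baselineWindowCosts.length === 0) return false; // no baseline yet
  const total = baselineWindowCosts.reduce((sum, c) => sum + c, 0);
  const baseline = total / baselineWindowCosts.length;
  return currentWindowCost > baseline * threshold;
}

// Past hourly costs in USD; their mean is 5.
const hourlyBaseline = [4, 6, 5, 5];

console.log(isCostSpike(12, hourlyBaseline, 2.0)); // true: 12 > 5 * 2
console.log(isCostSpike(9, hourlyBaseline, 2.0)); // false: 9 <= 5 * 2
```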

Next Steps

Agent Health

Monitor agent health scores

Monitoring

Real-time agent monitoring

Scheduled Tasks

Automate recurring operations

Trust Alerts

Alert on token consumption anomalies
