Usage & Cost Analytics

Real-time usage summaries, time-series data, cost tracking, and breakdowns by app, team, model, provider, user, or session.

Overview

Ingate tracks every proxied request and computes dollar cost from token counts using a built-in price table. Usage data can be queried as summaries, time-series, or breakdowns across multiple dimensions. All analytics endpoints are scoped to the current organization and require a valid API key.

Analytics data powers both the API responses documented below and the built-in Dashboard Views available in the Ingate console.

Usage Summary

GET /api/v1/usage/summary

Aggregate usage statistics for the current org.

Parameter   Type     Default      Description
since       RFC3339  30 days ago  Start of time range
until       RFC3339  now          End of time range
app_id      UUID     optional     Filter by app
team_id     UUID     optional     Filter by team
provider    string   optional     Filter by provider name
model       string   optional     Filter by model name
user_id     string   optional     Filter by end-user ID (passed via X-Ingate-User-Id header)
session_id  string   optional     Filter by session ID (passed via X-Ingate-Session-Id header)

bash
curl "https://api.ingateai.com/api/v1/usage/summary?since=2026-03-01T00:00:00Z" \
  -H "Authorization: Bearer <token>"
Response (json)
{
  "total_requests": 142857,
  "total_input_tokens": 82400000,
  "total_output_tokens": 31200000,
  "total_cost_usd": 1284.62,
  "avg_latency_ms": 842,
  "p99_latency_ms": 3200,
  "cache_hit_rate": 0.23,
  "error_rate": 0.004,
  "since": "2026-03-01T00:00:00Z",
  "until": "2026-04-01T00:00:00Z"
}
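
The filters above can be combined in a single query string. The following is a minimal sketch of assembling that URL in Python. `build_summary_url` is a hypothetical helper, not part of any Ingate SDK, and it only builds the URL; sending the request with the Authorization header is left to your HTTP client.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://api.ingateai.com/api/v1/usage/summary"


def build_summary_url(since=None, until=None, **filters):
    """Assemble a usage-summary URL (hypothetical helper, not an Ingate SDK call).

    `since` and `until` are timezone-aware datetimes. Remaining keyword
    arguments are the documented filters (app_id, team_id, provider, model,
    user_id, session_id). Unset values are omitted so server defaults apply.
    """
    params = {}
    for name, value in (("since", since), ("until", until)):
        if value is not None:
            # RFC 3339 in UTC with a trailing "Z", matching the curl examples.
            params[name] = value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    params.update({k: v for k, v in filters.items() if v is not None})
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


url = build_summary_url(
    since=datetime(2026, 3, 1, tzinfo=timezone.utc),
    provider="openai",
    model="gpt-4o",
)
print(url)
```

`urlencode` percent-encodes the colons in the timestamp (`%3A`), which is equivalent to the unencoded form used in the curl example.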

Time-Series

GET /api/v1/usage/timeseries

Usage data bucketed by time period.

Returns usage metrics grouped into fixed-width time buckets. The granularity parameter controls bucket width: hour (default) or day. All summary filters (since, until, app_id, team_id, provider, model, user_id, session_id) apply.

bash
curl "https://api.ingateai.com/api/v1/usage/timeseries?granularity=day&since=2026-03-01T00:00:00Z" \
  -H "Authorization: Bearer <token>"
Response (json)
{
  "granularity": "day",
  "buckets": [
    {
      "timestamp": "2026-03-01T00:00:00Z",
      "requests": 4820,
      "input_tokens": 2800000,
      "output_tokens": 1050000,
      "cost_usd": 42.18,
      "avg_latency_ms": 810,
      "error_count": 12
    },
    {
      "timestamp": "2026-03-02T00:00:00Z",
      "requests": 5104,
      "input_tokens": 3100000,
      "output_tokens": 1180000,
      "cost_usd": 47.93,
      "avg_latency_ms": 795,
      "error_count": 8
    }
  ]
}
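
Buckets are often rolled back up on the client, for example to total a custom date range. The sketch below does this for the two sample buckets above, trimmed to the fields it uses. `summarize_buckets` is a hypothetical helper, and treating a missing or null cost_usd as zero is an assumption made for the example, not documented API behavior.

```python
# Sample buckets copied from the response above (only the fields used here).
response = {
    "granularity": "day",
    "buckets": [
        {"timestamp": "2026-03-01T00:00:00Z", "requests": 4820, "cost_usd": 42.18, "error_count": 12},
        {"timestamp": "2026-03-02T00:00:00Z", "requests": 5104, "cost_usd": 47.93, "error_count": 8},
    ],
}


def summarize_buckets(buckets):
    """Roll time-series buckets up into totals and an overall error rate."""
    total_requests = sum(b["requests"] for b in buckets)
    total_errors = sum(b["error_count"] for b in buckets)
    # Assumption: a missing or null cost_usd counts as 0 in the total.
    total_cost = sum(b.get("cost_usd") or 0 for b in buckets)
    return {
        "requests": total_requests,
        "cost_usd": round(total_cost, 2),
        "error_rate": total_errors / total_requests if total_requests else 0.0,
    }


totals = summarize_buckets(response["buckets"])
print(totals)
```

The error rate is computed from the summed counts rather than by averaging per-bucket rates, so busy buckets are weighted correctly.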

Breakdown

GET /api/v1/usage/breakdown

Usage grouped by a dimension.

The group_by parameter is required. Valid values:

Value       Groups by
app         Application
team        Team
model       Model name
provider    Provider name
user_id     End-user ID
session_id  Session ID

All summary filters apply. Results are sorted by cost in descending order (cost_usd in each group) and limited to the top 100 groups by default; use the limit parameter to adjust.

bash
# Breakdown by model for the last 7 days
curl "https://api.ingateai.com/api/v1/usage/breakdown?group_by=model&since=2026-03-28T00:00:00Z" \
  -H "Authorization: Bearer <token>"
Response (json)
{
  "group_by": "model",
  "groups": [
    {
      "key": "gpt-4o",
      "requests": 52400,
      "input_tokens": 31000000,
      "output_tokens": 12800000,
      "cost_usd": 420.00
    },
    {
      "key": "claude-sonnet-4-20250514",
      "requests": 28100,
      "input_tokens": 18200000,
      "output_tokens": 7400000,
      "cost_usd": 234.60
    }
  ]
}
bash
# Per-user breakdown for a specific app
curl "https://api.ingateai.com/api/v1/usage/breakdown?group_by=user_id&app_id=app-uuid-here" \
  -H "Authorization: Bearer <token>"
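
A common follow-up to a breakdown query is each group's share of spend. The sketch below computes it from the sample groups in the model breakdown response above. `cost_shares` is a hypothetical helper. Groups whose cost is null are skipped, and shares are relative to the returned groups only, so they exclude anything cut off by the group limit.

```python
# Sample groups copied from the model breakdown response (trimmed fields).
groups = [
    {"key": "gpt-4o", "requests": 52400, "cost_usd": 420.00},
    {"key": "claude-sonnet-4-20250514", "requests": 28100, "cost_usd": 234.60},
]


def cost_shares(groups):
    """Return {key: fraction of total cost} for groups with a known cost."""
    priced = [g for g in groups if g.get("cost_usd") is not None]
    total = sum(g["cost_usd"] for g in priced)
    if total == 0:
        return {g["key"]: 0.0 for g in priced}
    return {g["key"]: g["cost_usd"] / total for g in priced}


shares = cost_shares(groups)
for key, share in shares.items():
    print(f"{key}: {share:.1%}")
```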

Cost Tracking

Enterprise feature

Cost tracking is available on Enterprise plans. Free-plan orgs see token counts but not dollar costs.

Ingate calculates dollar cost for every request using a built-in price table with per-million-token rates for input and output tokens. Costs are computed automatically with no configuration needed. When a request uses a model not in the price table, tokens are still recorded but cost is reported as null.
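
The arithmetic behind per-million-token pricing can be sketched as follows. This illustrates the formula only and is not Ingate's implementation: the rates are copied from the sample price table response on this page, `request_cost` is a hypothetical function, and any rounding Ingate applies is not covered here.

```python
from decimal import Decimal

# Illustrative rates in USD per million tokens (input, output), taken from
# the sample price table response; live rates come from /api/v1/cost/prices.
PRICES = {
    "gpt-4o": (Decimal("2.50"), Decimal("10.00")),
    "claude-sonnet-4-20250514": (Decimal("3.00"), Decimal("15.00")),
}
MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens, prices=PRICES):
    """Return the USD cost of one request, or None for an unpriced model."""
    rates = prices.get(model)
    if rates is None:
        return None  # mirrors the documented null cost for unknown models
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / MILLION


print(request_cost("gpt-4o", 1_000_000, 500_000))
print(request_cost("my-custom-model", 1_200, 350))  # hypothetical unpriced model
```

Decimal is used instead of float so that sums of many small per-request costs do not accumulate binary rounding error.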

Supported Models

Provider   Models
OpenAI     gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini, o3-mini, o4-mini
Anthropic  claude-sonnet-4-20250514, claude-3.5-sonnet, claude-3.5-haiku, claude-3-opus, claude-3-sonnet, claude-3-haiku
Ollama     Zero cost (local inference)

Auto-updating price table

The price table is updated automatically as new models are released. You do not need to redeploy or change any configuration. Ingate pulls the latest pricing data on a regular schedule. Query the price table API below to see the current rates at any time.

Price Table API

GET /api/v1/cost/prices

Returns the current price table with per-million-token rates.

bash
curl "https://api.ingateai.com/api/v1/cost/prices" \
  -H "Authorization: Bearer <token>"
Response (json)
{
  "updated_at": "2026-04-01T00:00:00Z",
  "prices": [
    {
      "provider": "openai",
      "model": "gpt-4o",
      "input_per_million": 2.50,
      "output_per_million": 10.00
    },
    {
      "provider": "openai",
      "model": "o4-mini",
      "input_per_million": 1.10,
      "output_per_million": 4.40
    },
    {
      "provider": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "input_per_million": 3.00,
      "output_per_million": 15.00
    },
    {
      "provider": "ollama",
      "model": "*",
      "input_per_million": 0,
      "output_per_million": 0
    }
  ]
}
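
A client that estimates costs locally might index this response for lookup. The sketch below is one way to do that. `index_prices` and `find_price` are hypothetical helpers, `llama3` is a made-up model name, and reading the `"*"` entry as "any model from this provider" is an assumption drawn from the Ollama row, not documented matching behavior.

```python
# Trimmed copy of the sample price table response.
price_table = {
    "updated_at": "2026-04-01T00:00:00Z",
    "prices": [
        {"provider": "openai", "model": "gpt-4o", "input_per_million": 2.50, "output_per_million": 10.00},
        {"provider": "ollama", "model": "*", "input_per_million": 0, "output_per_million": 0},
    ],
}


def index_prices(table):
    """Index price entries by (provider, model) for constant-time lookup."""
    return {(p["provider"], p["model"]): p for p in table["prices"]}


def find_price(index, provider, model):
    """Return the exact entry, else the provider's "*" entry, else None."""
    # Assumption: "*" acts as a wildcard for every model of that provider.
    return index.get((provider, model)) or index.get((provider, "*"))


index = index_prices(price_table)
print(find_price(index, "openai", "gpt-4o")["output_per_million"])
print(find_price(index, "ollama", "llama3")["input_per_million"])
print(find_price(index, "openai", "unknown-model"))
```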

Dashboard Views

The Ingate console includes a built-in analytics dashboard that visualizes usage and cost data in real time. The dashboard is available at https://console.ingateai.com/analytics and requires no additional setup.

Overview Widgets

The top of the dashboard displays summary cards for the selected time range:

  • Total Requests: aggregate request count with trend indicator vs. prior period
  • Total Tokens: combined input + output token count
  • Total Cost: dollar spend (Enterprise only)
  • Avg Latency: mean response time in milliseconds
  • Cache Hit Rate: percentage of requests served from cache
  • Error Rate: percentage of requests returning 4xx/5xx

Each widget shows the current value and a comparison delta against the previous equivalent period (e.g., this week vs. last week).
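
To reproduce a comparison like this from the API, one approach is to query the summary endpoint twice: once for the selected range and once for the window of equal length that ends where the selected range begins. The sketch below shows the date arithmetic and the relative change. Both helper names are hypothetical, and the console's exact method for computing the delta is an assumption here.

```python
from datetime import datetime, timezone


def previous_period(since, until):
    """Return (start, end) of the equal-length window ending at `since`."""
    length = until - since
    return since - length, since


def percent_change(current, previous):
    """Relative change from `previous` to `current`; None when previous is 0."""
    if previous == 0:
        return None
    return (current - previous) / previous


since = datetime(2026, 3, 8, tzinfo=timezone.utc)
until = datetime(2026, 3, 15, tzinfo=timezone.utc)
prev_since, prev_until = previous_period(since, until)
print(prev_since.isoformat(), prev_until.isoformat())

# Example values only: request counts for the current and previous window.
print(percent_change(5104, 4820))
```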

Charts

Below the widgets, interactive charts provide deeper visibility:

  • Requests over time: stacked area chart with hourly or daily granularity, segmented by provider or model
  • Token usage over time: line chart showing input vs. output tokens
  • Cost over time: bar chart of daily spend, color-coded by provider (Enterprise only)
  • Latency distribution: histogram of response times with p50, p95, and p99 markers
  • Top models: horizontal bar chart ranking models by request volume or cost
  • Breakdown table: sortable table matching the Breakdown API, groupable by app, team, model, provider, user, or session

All charts respect the global time range selector and any active filters (app, team, provider, model). Data refreshes automatically every 60 seconds while the dashboard is open.

Shareable links

Filter selections are encoded in the URL query string. Copy the URL to share a specific dashboard view with teammates.