Usage & Cost Analytics
Real-time usage summaries, time-series data, cost tracking, and breakdowns by app, team, model, provider, user, or session.
Overview
Ingate tracks every proxied request and computes dollar cost from token counts using a built-in price table. Usage data can be queried as summaries, time-series, or breakdowns across multiple dimensions. All analytics endpoints are scoped to the current organization and require a valid API key.
Analytics data powers both the API responses documented below and the built-in Dashboard Views available in the Ingate console.
Usage Summary
/api/v1/usage/summary
Aggregate usage statistics for the current org.
| Parameter | Type | Default | Description |
|---|---|---|---|
| since | RFC3339 | 30 days ago | Start of time range |
| until | RFC3339 | now | End of time range |
| app_id | UUID | optional | Filter by app |
| team_id | UUID | optional | Filter by team |
| provider | string | optional | Filter by provider name |
| model | string | optional | Filter by model name |
| user_id | string | optional | Filter by end-user ID (passed via X-Ingate-User-Id header) |
| session_id | string | optional | Filter by session ID (passed via X-Ingate-Session-Id header) |
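As a minimal sketch, the filters above can be assembled into a query string as follows. The helper names `build_usage_url` and `to_rfc3339`, and the choice to omit unset filters so the server defaults apply, are illustrative assumptions; only the base URL and the parameter names come from this page.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://api.ingateai.com/api/v1/usage/summary"

# Filter names documented in the parameter table above.
ALLOWED_FILTERS = {
    "since", "until", "app_id", "team_id",
    "provider", "model", "user_id", "session_id",
}


def to_rfc3339(value: datetime) -> str:
    """Format a timezone-aware datetime as an RFC3339 UTC timestamp."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_usage_url(**filters) -> str:
    """Hypothetical helper: build a summary URL from the documented filters."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")
    params = {}
    for name, value in filters.items():
        if value is None:
            continue  # unset filters are left to the server-side defaults
        params[name] = to_rfc3339(value) if isinstance(value, datetime) else str(value)
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


url = build_usage_url(
    since=datetime(2026, 3, 1, tzinfo=timezone.utc),
    provider="openai",
)
print(url)
```

`urlencode` percent-encodes the colons in the timestamp (`%3A`), which is equivalent to the literal form used in the curl example.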
curl "https://api.ingateai.com/api/v1/usage/summary?since=2026-03-01T00:00:00Z" \
  -H "Authorization: Bearer <token>"

{
  "total_requests": 142857,
  "total_input_tokens": 82400000,
  "total_output_tokens": 31200000,
  "total_cost_usd": 1284.62,
  "avg_latency_ms": 842,
  "p99_latency_ms": 3200,
  "cache_hit_rate": 0.23,
  "error_rate": 0.004,
  "since": "2026-03-01T00:00:00Z",
  "until": "2026-04-01T00:00:00Z"
}

Time-Series
/api/v1/usage/timeseries
Usage data bucketed by time period.
Returns usage metrics grouped into fixed-width time buckets. The granularity parameter controls bucket width: hour (default) or day. All summary filters (since, until, app_id, team_id, provider, model, user_id, session_id) apply.
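To illustrate what fixed-width bucketing means, the sketch below maps a request timestamp to the start of its hour or day bucket. It assumes buckets are aligned to UTC boundaries, which matches the sample timestamps in this section but is not stated explicitly; `bucket_start` is a hypothetical helper, not part of the API.

```python
from datetime import datetime, timezone


def bucket_start(ts: datetime, granularity: str = "hour") -> datetime:
    """Truncate an aware timestamp to the start of its UTC bucket (assumed alignment)."""
    ts = ts.astimezone(timezone.utc)
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError("granularity must be 'hour' or 'day'")


request_time = datetime(2026, 3, 1, 14, 37, 5, tzinfo=timezone.utc)
hour_bucket = bucket_start(request_time, "hour")  # 14:00 UTC on 2026-03-01
day_bucket = bucket_start(request_time, "day")    # 00:00 UTC on 2026-03-01
print(hour_bucket.isoformat(), day_bucket.isoformat())
```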
curl "https://api.ingateai.com/api/v1/usage/timeseries?granularity=day&since=2026-03-01T00:00:00Z" \
  -H "Authorization: Bearer <token>"

{
  "granularity": "day",
  "buckets": [
    {
      "timestamp": "2026-03-01T00:00:00Z",
      "requests": 4820,
      "input_tokens": 2800000,
      "output_tokens": 1050000,
      "cost_usd": 42.18,
      "avg_latency_ms": 810,
      "error_count": 12
    },
    {
      "timestamp": "2026-03-02T00:00:00Z",
      "requests": 5104,
      "input_tokens": 3100000,
      "output_tokens": 1180000,
      "cost_usd": 47.93,
      "avg_latency_ms": 795,
      "error_count": 8
    }
  ]
}

Breakdown
/api/v1/usage/breakdown
Usage grouped by a dimension.
The group_by parameter is required. Valid values:
| Value | Groups by |
|---|---|
| app | Application |
| team | Team |
| model | Model name |
| provider | Provider name |
| user_id | End-user ID |
| session_id | Session ID |
All summary filters apply. Results are sorted by total_cost_usd descending and limited to the top 100 groups by default (use limit to adjust).
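Since groups are ranked by cost, a common follow-up is to work out each group's share of spend. The sketch below does this client-side for the two groups from this section's sample response. `cost_shares` is a hypothetical helper, and the shares are relative to the groups returned, not necessarily to total org spend when the limit truncates the list.

```python
# Groups copied (trimmed) from the sample breakdown response in this section.
groups = [
    {"key": "gpt-4o", "requests": 52400, "cost_usd": 420.00},
    {"key": "claude-sonnet-4-20250514", "requests": 28100, "cost_usd": 234.60},
]


def cost_shares(groups):
    """Return each group's fraction of the total cost across the returned groups."""
    # cost_usd may be null (None) for models missing from the price table.
    total = sum(g["cost_usd"] or 0 for g in groups)
    if total == 0:
        return {g["key"]: 0.0 for g in groups}
    return {g["key"]: (g["cost_usd"] or 0) / total for g in groups}


shares = cost_shares(groups)
for key, share in shares.items():
    print(f"{key}: {share:.1%}")
```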
# Breakdown by model for the last 7 days
curl "https://api.ingateai.com/api/v1/usage/breakdown?group_by=model&since=2026-03-28T00:00:00Z" \
  -H "Authorization: Bearer <token>"

{
  "group_by": "model",
  "groups": [
    {
      "key": "gpt-4o",
      "requests": 52400,
      "input_tokens": 31000000,
      "output_tokens": 12800000,
      "cost_usd": 420.00
    },
    {
      "key": "claude-sonnet-4-20250514",
      "requests": 28100,
      "input_tokens": 18200000,
      "output_tokens": 7400000,
      "cost_usd": 234.60
    }
  ]
}

# Per-user breakdown for a specific app
curl "https://api.ingateai.com/api/v1/usage/breakdown?group_by=user_id&app_id=app-uuid-here" \
  -H "Authorization: Bearer <token>"

Cost Tracking
Enterprise feature
Ingate calculates dollar cost for every request using a built-in price table with per-million-token rates for input and output tokens. Costs are computed automatically with no configuration needed. When a request uses a model not in the price table, tokens are still recorded but cost is reported as null.
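The per-request arithmetic can be sketched as below: tokens divided by one million, multiplied by the matching rate, summed over input and output. The gpt-4o rates are taken from the sample price table response on this page; the `PRICES` layout and the `request_cost` helper are assumptions for illustration and do not describe Ingate's internal implementation.

```python
from decimal import Decimal

# Per-million-token USD rates. The gpt-4o values come from the sample
# price table on this page; the dict layout itself is an assumption.
PRICES = {
    "gpt-4o": {"input": Decimal("2.50"), "output": Decimal("10.00")},
}
MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request, or None if the model has no price entry."""
    rates = PRICES.get(model)
    if rates is None:
        return None  # tokens are still recorded, but cost is reported as null
    return (
        Decimal(input_tokens) * rates["input"]
        + Decimal(output_tokens) * rates["output"]
    ) / MILLION


priced = request_cost("gpt-4o", 1200, 350)
unpriced = request_cost("my-custom-model", 1200, 350)
print(priced, unpriced)
```

For 1,200 input tokens and 350 output tokens on gpt-4o this works out to 0.003 + 0.0035 = 0.0065 USD. `Decimal` is used here to avoid floating-point drift when many small costs are summed.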
Supported Models
| Provider | Models |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini, o3-mini, o4-mini |
| Anthropic | claude-sonnet-4-20250514, claude-3.5-sonnet, claude-3.5-haiku, claude-3-opus, claude-3-sonnet, claude-3-haiku |
| Ollama | All models; zero cost (local inference) |
Auto-updating price table
Price Table API
/api/v1/cost/prices
Returns the current price table with per-million-token rates.
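A client that wants to look up rates locally might index the response by provider and model. The sketch below uses a trimmed copy of this section's sample response. Treating the `"*"` model as a provider-wide fallback is an assumption inferred from the Ollama entry, and `index_prices` and `lookup_price` are hypothetical helpers.

```python
# Trimmed copy of the sample response shown in this section.
price_table = {
    "updated_at": "2026-04-01T00:00:00Z",
    "prices": [
        {"provider": "openai", "model": "gpt-4o",
         "input_per_million": 2.50, "output_per_million": 10.00},
        {"provider": "ollama", "model": "*",
         "input_per_million": 0, "output_per_million": 0},
    ],
}


def index_prices(table):
    """Index price entries by (provider, model) for constant-time lookup."""
    return {(p["provider"], p["model"]): p for p in table["prices"]}


def lookup_price(index, provider, model):
    """Exact match first, then the provider's "*" entry (assumed wildcard), else None."""
    return index.get((provider, model)) or index.get((provider, "*"))


index = index_prices(price_table)
print(lookup_price(index, "openai", "gpt-4o"))
print(lookup_price(index, "ollama", "llama3"))
print(lookup_price(index, "openai", "unknown-model"))
```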
curl "https://api.ingateai.com/api/v1/cost/prices" \
  -H "Authorization: Bearer <token>"

{
  "updated_at": "2026-04-01T00:00:00Z",
  "prices": [
    {
      "provider": "openai",
      "model": "gpt-4o",
      "input_per_million": 2.50,
      "output_per_million": 10.00
    },
    {
      "provider": "openai",
      "model": "o4-mini",
      "input_per_million": 1.10,
      "output_per_million": 4.40
    },
    {
      "provider": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "input_per_million": 3.00,
      "output_per_million": 15.00
    },
    {
      "provider": "ollama",
      "model": "*",
      "input_per_million": 0,
      "output_per_million": 0
    }
  ]
}

Dashboard Views
The Ingate console includes a built-in analytics dashboard that visualizes usage and cost data in real time. The dashboard is available at https://console.ingateai.com/analytics and requires no additional setup.
Overview Widgets
The top of the dashboard displays summary cards for the selected time range:
- Total Requests: aggregate request count with trend indicator vs. prior period
- Total Tokens: combined input + output token count
- Total Cost: dollar spend (Enterprise only)
- Avg Latency: mean response time in milliseconds
- Cache Hit Rate: percentage of requests served from cache
- Error Rate: percentage of requests returning 4xx/5xx
Each widget shows the current value and a comparison delta against the previous equivalent period (e.g., this week vs. last week).
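The same comparison can be approximated against the API by querying two adjacent windows of equal length. The sketch below shows one plausible way to derive the prior window and a percentage delta; how the console actually computes its deltas is not documented here, so both helpers are assumptions.

```python
from datetime import datetime, timezone


def previous_period(since: datetime, until: datetime):
    """Return the window of equal length that ends where the current one starts."""
    length = until - since
    return since - length, since


def percent_delta(current: float, previous: float):
    """Percentage change vs. the prior period; None when the prior value is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


since = datetime(2026, 3, 8, tzinfo=timezone.utc)
until = datetime(2026, 3, 15, tzinfo=timezone.utc)
prev_since, prev_until = previous_period(since, until)
print(prev_since.isoformat(), prev_until.isoformat())
print(percent_delta(120, 100))
```

Each window's `since` and `until` can then be passed to /api/v1/usage/summary, and the two results compared field by field.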
Charts
Below the widgets, interactive charts provide deeper visibility:
- Requests over time: stacked area chart with hourly or daily granularity, segmented by provider or model
- Token usage over time: line chart showing input vs. output tokens
- Cost over time: bar chart of daily spend, color-coded by provider (Enterprise only)
- Latency distribution: histogram of response times with p50, p95, and p99 markers
- Top models: horizontal bar chart ranking models by request volume or cost
- Breakdown table: sortable table matching the Breakdown API, groupable by app, team, model, provider, user, or session
All charts respect the global time range selector and any active filters (app, team, provider, model). Data refreshes automatically every 60 seconds while the dashboard is open.
Shareable links