Ingate is an all-in-one AI gateway for enterprise that sits between your applications and LLM providers. It combines a transparent proxy, full observability, prompt versioning, automated evaluations, provider fallback, response caching, guardrails, budget controls, and security governance in a single platform. No SDK is required, and the only code change is the base URL.

Why use an AI gateway instead of separate tools?

Most AI infrastructure requires stitching together separate tools for observability, prompt management, evaluation, and request routing. Each tool adds its own SDK, configuration, and data silo. An all-in-one gateway like Ingate gives you a single integration point: one base URL change, and you get logging, evals, prompts, caching, guardrails, and budget controls working together. No SDK, no glue code, no data fragmentation.

Does Ingate require an SDK?

No. Ingate works with any HTTP client in any language. Change your base URL to point at Ingate and your existing OpenAI, Anthropic, or custom LLM code works unchanged. Ingate auto-detects the provider from the request path, so most integrations don't even need an extra header. Setup takes under 5 minutes.

What features does Ingate include?

Ingate includes: transparent proxy with SSE streaming, provider fallback, format translation (OpenAI Chat, Responses API, and Anthropic), response caching, PII redaction, prompt injection detection, 14 built-in evaluators, versioned prompt registry, interactive playground, datasets, session tracking, cost tracking, budget controls, audit logging, HMAC-signed webhooks, scoped API keys with rotation, multi-tenant RBAC, REST ingestion API, OpenTelemetry receiver, and Bring Your Own Storage (BYOS).

How does Ingate handle security?

Provider API keys are encrypted with AES-256-GCM at rest. API keys are stored as SHA-256 hashes and never logged. Ingate supports scoped keys with per-provider and per-model restrictions, zero-downtime key rotation, PII redaction on request bodies, prompt injection detection, SSRF protection on outbound requests, and a complete audit log of administrative actions. Auth endpoints are rate-limited to prevent brute-force attacks.

What plans does Ingate offer?

Ingate offers two plans. The Free plan includes the transparent proxy, SSE streaming, request logging, a 7-day retention period, and a built-in dashboard. The Enterprise plan adds prompt versioning, evaluations, datasets, playground, provider fallback, format translation, response caching, PII redaction, guardrails, webhooks, audit log, budget controls, cost tracking, Prometheus metrics, BYOS, custom retention, and a 99.9% uptime SLA.

Session & User Tracking

Track individual users and conversation sessions across LLM interactions for per-user analytics, session replay, cost attribution, and debugging.

Overview

Most LLM traffic is anonymous by default. You see request counts and token usage, but you can't answer “which user is burning through tokens?” or “what happened in that conversation?” Session and user tracking closes this gap.

When you attach a user ID and/or session ID to your LLM requests, Ingate records them alongside every log entry. This unlocks:

Per-user cost attribution: see which users drive the most spend
Session timelines: replay the full sequence of LLM calls in a conversation
Usage breakdowns: group analytics by user or session
Debugging: trace a reported bug back to the exact exchange sequence
Abuse detection: identify users with unusual request patterns

IDs are opaque strings

User and session IDs are your own identifiers. Ingate stores them as opaque strings. Use whatever format your application already has: UUIDs, email hashes, database IDs, or any stable string up to 256 characters.

Three Extraction Points

Ingate can extract user and session IDs from three sources, depending on how your data enters the system. All three methods produce the same result: a user_id and/or session_id attached to each log entry.

1. Proxy Headers

When routing requests through the Ingate proxy, set these headers on each request:

Header	Description
`X-Ingate-User-Id`	Your application's user identifier
`X-Ingate-Session-Id`	Conversation or session identifier

Both headers are stripped before the request is forwarded to the upstream provider, so they never reach OpenAI, Anthropic, or any other backend. This keeps your internal identifiers private.

2. Ingestion API Tags

When sending data via the REST Ingestion API, include user and session IDs in the tags object on each exchange:

json: Ingestion payload with tags

{
  "provider": "openai",
  "path": "/v1/chat/completions",
  "method": "POST",
  "status_code": 200,
  "latency_ms": 350,
  "request_body": { "..." : "..." },
  "response_body": { "..." : "..." },
  "tags": {
    "user_id": "user_42",
    "session_id": "session_abc123",
    "environment": "production"
  }
}

The tags object can contain any key-value pairs. user_id and session_id are the two that Ingate recognizes for session tracking. Other tags are stored as metadata and appear in log entries.
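
As a rough illustration of the payload shape above, the following Python sketch assembles one exchange and merges the two recognized tags with any custom ones. The helper name `build_exchange` and its parameters are hypothetical. The sketch only builds the dictionary, because the ingestion endpoint path is not given in this section.

```python
import json


def build_exchange(request_body, response_body, *, provider="openai",
                   path="/v1/chat/completions", status_code=200, latency_ms=0,
                   user_id=None, session_id=None, extra_tags=None):
    """Build one ingestion payload (hypothetical helper, not an Ingate API)."""
    # Custom tags are copied first so the recognized keys below always win.
    tags = dict(extra_tags or {})
    if user_id is not None:
        tags["user_id"] = user_id
    if session_id is not None:
        tags["session_id"] = session_id
    return {
        "provider": provider,
        "path": path,
        "method": "POST",
        "status_code": status_code,
        "latency_ms": latency_ms,
        "request_body": request_body,
        "response_body": response_body,
        "tags": tags,
    }


payload = build_exchange(
    {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]},
    {"choices": []},
    latency_ms=350,
    user_id="user_42",
    session_id="session_abc123",
    extra_tags={"environment": "production"},
)
print(json.dumps(payload["tags"], indent=2))
```

Keeping the tag merge in one place makes it harder to forget the IDs on one code path and send them on another.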

3. OpenTelemetry Span Attributes

When sending traces via the OTLP endpoint, Ingate extracts user and session IDs from standard GenAI semantic convention attributes:

OTel Attribute	Ingate Field
`gen_ai.user.id`	`user_id`
`gen_ai.session.id`	`session_id`

These follow the emerging OpenTelemetry GenAI semantic conventions. Set them as span attributes on your GenAI spans alongside the standard gen_ai.system and gen_ai.request.model attributes.
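
A minimal sketch of the attribute set described above, written as a plain dictionary so it does not depend on any tracing library. The function name `genai_span_attributes` is hypothetical; the attribute keys are the ones listed in the table.

```python
def genai_span_attributes(system, model, user_id=None, session_id=None):
    """Return span attributes carrying optional user and session IDs."""
    attrs = {
        "gen_ai.system": system,
        "gen_ai.request.model": model,
    }
    # Only attach IDs that are actually known; missing keys are simply omitted.
    if user_id:
        attrs["gen_ai.user.id"] = user_id
    if session_id:
        attrs["gen_ai.session.id"] = session_id
    return attrs


attrs = genai_span_attributes("openai", "gpt-4o-mini",
                              user_id="user_42", session_id="session_abc123")
print(attrs)
```

With the OpenTelemetry Python API, a dictionary like this can be passed as the `attributes` argument when starting a span, or applied key by key with `span.set_attribute`.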

Priority order

If the same exchange arrives through multiple channels (e.g. proxy + OTel), header values take precedence over OTel attributes. Ingestion API tags always win since they're set explicitly per-exchange.
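
One way to read the rule above is as a three-level lookup: ingestion tags first, then proxy headers, then OTel attributes. The sketch below models that reading in Python. It is an illustration of the stated order, not Ingate's implementation, and `resolve_id` is a hypothetical name.

```python
HEADER_NAMES = {"user_id": "x-ingate-user-id", "session_id": "x-ingate-session-id"}
OTEL_NAMES = {"user_id": "gen_ai.user.id", "session_id": "gen_ai.session.id"}


def resolve_id(field, tags=None, headers=None, otel_attrs=None):
    """Pick the winning value for 'user_id' or 'session_id'."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {k.lower(): v for k, v in (headers or {}).items()}
    candidates = (
        (tags or {}).get(field),                    # 1. ingestion API tags
        lowered.get(HEADER_NAMES[field]),           # 2. proxy headers
        (otel_attrs or {}).get(OTEL_NAMES[field]),  # 3. OTel span attributes
    )
    # Return the first non-empty candidate, or None if no source supplied one.
    return next((c for c in candidates if c), None)


winner = resolve_id(
    "user_id",
    headers={"X-Ingate-User-Id": "from_header"},
    otel_attrs={"gen_ai.user.id": "from_otel"},
)
print(winner)
```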

Sending User & Session IDs

The most common approach is setting proxy headers. Here's how to do it in each language:

cURL

bash

curl https://api.ingateai.com/v1/chat/completions \
  -H "X-Ingate-Provider: openai" \
  -H "X-Ingate-Key: sk-ingate-your-key" \
  -H "X-Ingate-User-Id: user_42" \
  -H "X-Ingate-Session-Id: session_abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Python

python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.ingateai.com/v1",
    api_key="sk-ingate-your-key",
    default_headers={
        "X-Ingate-Provider": "openai",
    }
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={
        "X-Ingate-User-Id": current_user.id,
        "X-Ingate-Session-Id": session.id,
    }
)

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ingateai.com/v1",
  apiKey: "sk-ingate-your-key",
  defaultHeaders: {
    "X-Ingate-Provider": "openai",
  },
});

const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello" }],
  },
  {
    headers: {
      "X-Ingate-User-Id": currentUser.id,
      "X-Ingate-Session-Id": sessionId,
    },
  }
);

Don't send PII as user IDs

Avoid using email addresses, full names, or other personally identifiable information as user IDs. Use opaque identifiers (database IDs, UUIDs, or hashed values) so session data doesn't become a PII liability.
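
If your only stable identifier is an email address, one option is to derive an opaque ID from it with a keyed hash. The Python sketch below assumes a server-side secret and a `u_` prefix, both of which are illustrative choices rather than anything Ingate requires.

```python
import hashlib
import hmac

MAX_ID_LENGTH = 256  # length limit stated for user and session IDs


def opaque_user_id(email, secret):
    """Derive a stable ID from an email without exposing the address itself."""
    # Normalize so "Ada@Example.com " and "ada@example.com" map to the same ID.
    normalized = email.strip().lower()
    digest = hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    user_id = f"u_{digest}"  # "u_" prefix is an arbitrary convention
    if len(user_id) > MAX_ID_LENGTH:
        raise ValueError("derived ID exceeds the 256-character limit")
    return user_id


uid = opaque_user_id("Ada@Example.com ", secret=b"replace-with-a-real-secret")
print(uid)
```

A keyed hash (HMAC) is used instead of a bare SHA-256 so that someone holding a list of email addresses cannot recompute the IDs without the secret.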

Sessions API

Query tracked sessions and replay their timelines. Sessions are created automatically when the first request with a given session_id is recorded.

List Sessions

GET

/api/v1/sessions

List sessions for the current org. Supports filtering and pagination.

Parameter	Type	Default	Description
`user_id`	string	optional	Filter sessions belonging to a specific user
`since`	RFC3339	24h ago	Return sessions with activity after this time
`until`	RFC3339	now	Return sessions with activity before this time
`limit`	int	50	Max results per page (max 200)
`offset`	int	0	Pagination offset

bash

# List recent sessions for a specific user
curl "https://api.ingateai.com/api/v1/sessions?user_id=user_42&limit=20" \
  -H "Authorization: Bearer <token>"

json: Response (200 OK)

{
  "sessions": [
    {
      "session_id": "session_abc123",
      "user_id": "user_42",
      "first_seen": "2026-04-04T08:15:00Z",
      "last_seen": "2026-04-04T08:32:14Z",
      "request_count": 12,
      "total_tokens": 8430,
      "models": ["gpt-4o-mini", "gpt-4o"],
      "providers": ["openai"]
    },
    {
      "session_id": "session_def456",
      "user_id": "user_42",
      "first_seen": "2026-04-03T14:01:00Z",
      "last_seen": "2026-04-03T14:18:55Z",
      "request_count": 7,
      "total_tokens": 3210,
      "models": ["gpt-4o-mini"],
      "providers": ["openai"]
    }
  ],
  "total": 2,
  "limit": 20,
  "offset": 0
}
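
The limit and offset parameters can be combined into a simple pagination loop. The Python sketch below assumes that `total` is the number of matching sessions across all pages, and it takes the HTTP call as an injected `fetch_json` callable (hypothetical) so that authentication and the HTTP client stay out of the example.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.ingateai.com"


def sessions_url(user_id=None, since=None, until=None, limit=50, offset=0):
    """Build a List Sessions URL from the documented query parameters."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params = {"user_id": user_id, "since": since, "until": until,
              "limit": limit, "offset": offset}
    # Drop unset filters so the server-side defaults apply.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/v1/sessions?{query}"


def iter_sessions(fetch_json, limit=50, **filters):
    """Yield every session, requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_json(sessions_url(limit=limit, offset=offset, **filters))
        sessions = page["sessions"]
        yield from sessions
        offset += len(sessions)
        # Stop on an empty page or once every reported session has been seen.
        if not sessions or offset >= page["total"]:
            break


print(sessions_url(user_id="user_42", limit=20))
```

In real use, `fetch_json` would issue the GET with the `Authorization: Bearer <token>` header and return the parsed JSON body.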

Session Timeline

GET

/api/v1/sessions/:id/timeline

Chronological log entries for a single session, showing the full conversation replay.

Parameter	Type	Default	Description
`limit`	int	100	Max entries to return (max 500)
`offset`	int	0	Pagination offset

bash

# Get the full conversation timeline for a session
curl "https://api.ingateai.com/api/v1/sessions/session_abc123/timeline?limit=50" \
  -H "Authorization: Bearer <token>"

json: Response (200 OK)

{
  "session_id": "session_abc123",
  "user_id": "user_42",
  "entries": [
    {
      "request_id": "req-uuid-001",
      "timestamp": "2026-04-04T08:15:00Z",
      "provider": "openai",
      "model": "gpt-4o-mini",
      "prompt": "What is the capital of France?",
      "completion": "The capital of France is Paris.",
      "input_tokens": 12,
      "output_tokens": 8,
      "latency_ms": 280,
      "status_code": 200,
      "cost_usd": 0.000034
    },
    {
      "request_id": "req-uuid-002",
      "timestamp": "2026-04-04T08:15:45Z",
      "provider": "openai",
      "model": "gpt-4o-mini",
      "prompt": "What about Germany?",
      "completion": "The capital of Germany is Berlin.",
      "input_tokens": 28,
      "output_tokens": 9,
      "latency_ms": 310,
      "status_code": 200,
      "cost_usd": 0.000041
    }
  ],
  "total": 12,
  "limit": 50,
  "offset": 0
}

Session replay for debugging

The timeline endpoint is especially useful for debugging user-reported issues. Ask the user for their session ID (or look it up by user ID), then replay the exact sequence of LLM calls, including prompts, completions, latencies, and error codes.
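
When triaging a report, it can help to reduce a timeline response to a few numbers before reading every exchange. The Python sketch below works on an already-parsed response with the field names shown in the example above; `summarize_timeline` is a hypothetical helper.

```python
def summarize_timeline(timeline):
    """Condense a timeline response into totals and points of interest."""
    entries = timeline["entries"]
    return {
        "calls": len(entries),
        "input_tokens": sum(e["input_tokens"] for e in entries),
        "output_tokens": sum(e["output_tokens"] for e in entries),
        "cost_usd": round(sum(e["cost_usd"] for e in entries), 6),
        # Requests whose status code indicates a client or server error.
        "errors": [e["request_id"] for e in entries if e["status_code"] >= 400],
        # The slowest call is often the first place to look.
        "slowest": max(entries, key=lambda e: e["latency_ms"])["request_id"]
        if entries else None,
    }


sample = {
    "session_id": "session_abc123",
    "entries": [
        {"request_id": "req-uuid-001", "input_tokens": 12, "output_tokens": 8,
         "latency_ms": 280, "status_code": 200, "cost_usd": 0.000034},
        {"request_id": "req-uuid-002", "input_tokens": 28, "output_tokens": 9,
         "latency_ms": 310, "status_code": 200, "cost_usd": 0.000041},
    ],
}
summary = summarize_timeline(sample)
print(summary)
```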

User Summary

Get aggregate statistics for each user, including total requests, token consumption, cost, active sessions, and model usage. Useful for identifying power users, cost outliers, and usage trends.

GET

/api/v1/users/summary

Per-user aggregate statistics for the current org.

Parameter	Type	Default	Description
`since`	RFC3339	30 days ago	Start of time range
`until`	RFC3339	now	End of time range
`sort_by`	string	`total_cost`	Sort field: `total_cost`, `request_count`, `total_tokens`
`order`	string	`desc`	Sort order: `asc` or `desc`
`limit`	int	50	Max results (max 200)
`offset`	int	0	Pagination offset

bash

# Top users by cost in the last 7 days
curl "https://api.ingateai.com/api/v1/users/summary?since=2026-03-28T00:00:00Z&sort_by=total_cost&limit=10" \
  -H "Authorization: Bearer <token>"

json: Response (200 OK)

{
  "users": [
    {
      "user_id": "user_42",
      "request_count": 847,
      "total_input_tokens": 124500,
      "total_output_tokens": 89200,
      "total_tokens": 213700,
      "total_cost_usd": 4.28,
      "session_count": 34,
      "models_used": ["gpt-4o", "gpt-4o-mini"],
      "first_seen": "2026-03-01T10:00:00Z",
      "last_seen": "2026-04-04T08:32:14Z"
    },
    {
      "user_id": "user_87",
      "request_count": 312,
      "total_input_tokens": 45800,
      "total_output_tokens": 31400,
      "total_tokens": 77200,
      "total_cost_usd": 1.55,
      "session_count": 18,
      "models_used": ["gpt-4o-mini"],
      "first_seen": "2026-03-15T14:20:00Z",
      "last_seen": "2026-04-03T19:45:00Z"
    }
  ],
  "total": 156,
  "limit": 10,
  "offset": 0
}
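
As one possible way to spot cost outliers from this response, the Python sketch below compares each user's cost per request with the median across users. The threshold factor of 3 and the helper name `cost_outliers` are arbitrary assumptions for illustration.

```python
from statistics import median


def cost_outliers(users, factor=3.0):
    """Return user IDs whose cost per request exceeds factor x the median."""
    per_request = {
        u["user_id"]: u["total_cost_usd"] / u["request_count"]
        for u in users
        if u["request_count"] > 0  # skip users with no requests in the window
    }
    if not per_request:
        return []
    typical = median(per_request.values())
    return sorted(uid for uid, cost in per_request.items() if cost > factor * typical)


users = [
    {"user_id": "user_a", "request_count": 100, "total_cost_usd": 1.0},
    {"user_id": "user_b", "request_count": 200, "total_cost_usd": 2.0},
    {"user_id": "user_c", "request_count": 100, "total_cost_usd": 5.0},
]
flagged = cost_outliers(users)
print(flagged)
```

Cost per request is used rather than total cost so that heavy but otherwise typical users are not flagged purely for volume.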

Usage Analytics Integration

User and session IDs integrate directly with the Usage & Cost Analytics system. The /api/v1/usage/breakdown endpoint supports group_by=user and group_by=session dimensions, and all usage endpoints accept user_id and session_id as filters.

Breakdown by User

bash

# Cost breakdown grouped by user
curl "https://api.ingateai.com/api/v1/usage/breakdown?group_by=user&since=2026-03-01T00:00:00Z" \
  -H "Authorization: Bearer <token>"

json: Response (200 OK)

{
  "group_by": "user",
  "groups": [
    {
      "key": "user_42",
      "request_count": 847,
      "total_tokens": 213700,
      "total_cost_usd": 4.28
    },
    {
      "key": "user_87",
      "request_count": 312,
      "total_tokens": 77200,
      "total_cost_usd": 1.55
    }
  ]
}
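
A breakdown response like the one above can be turned into relative spend per group with a few lines of Python. This is a sketch over the parsed JSON, and `spend_shares` is a hypothetical helper name.

```python
def spend_shares(breakdown):
    """Map each group key to its fraction of the total cost in the response."""
    groups = breakdown["groups"]
    total = sum(g["total_cost_usd"] for g in groups)
    if total == 0:
        # Avoid dividing by zero when nothing was spent in the window.
        return {g["key"]: 0.0 for g in groups}
    return {g["key"]: g["total_cost_usd"] / total for g in groups}


breakdown = {
    "group_by": "user",
    "groups": [
        {"key": "user_42", "request_count": 847, "total_tokens": 213700, "total_cost_usd": 4.28},
        {"key": "user_87", "request_count": 312, "total_tokens": 77200, "total_cost_usd": 1.55},
    ],
}
shares = spend_shares(breakdown)
for key, share in shares.items():
    print(f"{key}: {share:.1%}")
```

Note that the shares are relative to the groups returned in the response, so they describe the whole organization only if the response covers every group.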

Breakdown by Session

bash

# Token usage per session for a specific user
curl "https://api.ingateai.com/api/v1/usage/breakdown?group_by=session&user_id=user_42&since=2026-04-01T00:00:00Z" \
  -H "Authorization: Bearer <token>"

Filter Any Usage Endpoint by User or Session

The user_id and session_id query parameters work on all usage endpoints (summary, timeseries, and breakdown):

bash

# Usage summary for a single user
curl "https://api.ingateai.com/api/v1/usage/summary?user_id=user_42" \
  -H "Authorization: Bearer <token>"

# Hourly timeseries for a single session
curl "https://api.ingateai.com/api/v1/usage/timeseries?session_id=session_abc123&granularity=hour" \
  -H "Authorization: Bearer <token>"

Dashboard

The Ingate dashboard provides visual interfaces for exploring session and user data without writing API calls.

Sessions List

Navigate to Sessions in the sidebar to see all tracked sessions. The list view shows:

Session ID and associated user ID
First and last activity timestamps
Request count and total tokens
Models and providers used

Use the filter bar to narrow by user ID, date range, or model. Click any session row to open its timeline.

Session Timeline View

The timeline view displays each LLM exchange in chronological order as a visual replay of the conversation. Each entry shows:

Timestamp and latency
Model and provider
Prompt and completion text (expandable)
Token counts and cost
Status code and any error details

Link directly to a session

Share session links with your team for debugging: https://api.ingateai.com/ui/sessions/session_abc123

User Analytics View

Navigate to Users in the sidebar for the per-user analytics view. This page shows:

User table: sortable by cost, request count, token usage, or last active date
User detail: click a user to see their sessions, usage over time, model breakdown, and top conversations
Cost distribution: chart showing relative spend across users

Data retention

Session and user data follows your organization's retention policy. Free plans retain 7 days of history. Enterprise plans support configurable retention, up to and including unlimited.