Proxy & Routing
How Ingate routes LLM requests to providers with automatic path detection, wildcard routing, SSE streaming passthrough, and sub-millisecond overhead.
How Routing Works
Ingate uses wildcard routing: any request path that doesn't match an internal route (/api/* or /ui/*) is treated as a proxy request and forwarded to the target provider.
# All of these are proxied. Ingate doesn't hard-code endpoint paths
POST /v1/chat/completions → proxied
POST /v1/responses → proxied
POST /v1/messages → proxied
POST /v1/embeddings → proxied
POST /api/generate → proxied (Ollama)
POST /anything/you/want → proxied
# These are internal Ingate routes (not proxied)
GET /api/v1/logs → Ingate API
GET /ui/dashboard → Ingate dashboard
Provider Resolution
Ingate determines which provider to route to using three methods, checked in order:
- Explicit header: set X-Ingate-Provider: openai to target a specific provider by name.
- Path auto-detection: Ingate recognizes known API paths and routes automatically. For example, /v1/chat/completions routes to your configured OpenAI provider, and /v1/messages routes to Anthropic.
- Default provider: if no header is set and the path isn't recognized, the request routes to your organization's default provider.
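The resolution order above can be sketched as a small shell function. This illustrates the documented precedence only and is not Ingate's implementation: the function name resolve_provider, its argument order, and the provider names openai, anthropic, and ollama are assumptions chosen for the example.

```shell
# Hypothetical sketch of the documented resolution order.
# Usage: resolve_provider HEADER_VALUE REQUEST_PATH DEFAULT_PROVIDER
resolve_provider() {
  header="$1"
  path="$2"
  default_provider="$3"

  # 1. An explicit X-Ingate-Provider header wins.
  if [ -n "$header" ]; then
    printf '%s\n' "$header"
    return 0
  fi

  # 2. Known API paths are auto-detected (provider names are assumed).
  case "$path" in
    /v1/chat/completions|/v1/responses) printf 'openai\n' ;;
    /v1/messages)                       printf 'anthropic\n' ;;
    /api/generate|/api/chat)            printf 'ollama\n' ;;
    # 3. Anything else goes to the organization's default provider.
    *)                                  printf '%s\n' "$default_provider" ;;
  esac
}

resolve_provider "azure-openai" /v1/chat/completions openai   # prints azure-openai
resolve_provider "" /v1/messages openai                       # prints anthropic
resolve_provider "" /anything/you/want openai                 # prints openai
```

Because the header check comes first, an explicit X-Ingate-Provider value overrides whatever the path would otherwise auto-detect.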
Header is optional for known paths
You don't need X-Ingate-Provider for standard OpenAI, Anthropic, or Ollama paths. Just point your SDK at Ingate and it works.
# Auto-detected, no X-Ingate-Provider needed
curl https://api.ingateai.com/v1/chat/completions \
-H "X-Ingate-Key: sk-ingate-your-key" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
# Explicit provider, useful for custom endpoints or overriding auto-detect
curl https://api.ingateai.com/v1/chat/completions \
-H "X-Ingate-Provider: azure-openai" \
-H "X-Ingate-Key: sk-ingate-your-key" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
Supported API Shapes
Ingate works with any HTTP endpoint, but has first-class extraction for these shapes:
| API Shape | Example Path | Auto-Detected |
|---|---|---|
| OpenAI Chat Completions | /v1/chat/completions | Yes |
| OpenAI Responses API | /v1/responses | Yes |
| Anthropic Messages | /v1/messages | Yes |
| Ollama | /api/generate, /api/chat | Yes |
| Custom / Unknown | Any path | No, use header or default |
Request Headers
| Header | Required | Description |
|---|---|---|
| X-Ingate-Key | Yes | Your Ingate API key. Authenticates the request and determines your org, plan, and entitlements. |
| X-Ingate-Provider | No | Target provider name. Optional when auto-detection or a default provider can resolve the target. |
| X-Ingate-Translate | No | Set to true to opt into request/response format translation between API shapes. |
| X-Ingate-User-Id | No | Arbitrary user identifier. Logged with the request for per-user analytics and filtering. |
| X-Ingate-Session-Id | No | Arbitrary session identifier. Groups related requests in logs for conversation-level tracking. |
Ingate headers are stripped
X-Ingate-* headers are stripped from the upstream request before forwarding. The provider never sees them.
Response Headers
Ingate adds the following headers to every proxied response, alongside standard security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy):
| Header | Description |
|---|---|
| X-Ingate-Request-Id | Unique UUID for this request. Use it to look up logs, debug issues, and correlate across systems. |
| X-Ingate-Served-By | Present only on fallback. Shows which provider actually served the response when the primary failed. |
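To capture these headers from the command line, one option is to have curl dump the response headers and filter them. The helper below is a sketch: header_value is a hypothetical name, and the live request in the usage comment simply reuses the endpoint and key placeholder from the examples on this page.

```shell
# header_value NAME
# Reads raw HTTP response headers on stdin and prints the value of NAME.
# Header names are matched case-insensitively.
header_value() {
  awk -v name="$1" '
    BEGIN { FS = ":"; name = tolower(name) }
    tolower($1) == name {
      sub(/^[^:]*:[ \t]*/, "")  # drop "Name:" and the whitespace after it
      sub(/\r$/, "")            # drop the trailing CR from the HTTP line ending
      print
      exit
    }'
}

# Usage against a live endpoint (not run here):
#   curl -sS -o /dev/null -D - https://api.ingateai.com/v1/chat/completions \
#     -H "X-Ingate-Key: sk-ingate-your-key" \
#     -H "Content-Type: application/json" \
#     -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}' \
#     | header_value X-Ingate-Request-Id
```

The same helper can check for X-Ingate-Served-By. It prints nothing when the header is absent, which per the table above means the primary provider served the request.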
Streaming
Ingate supports SSE (Server-Sent Events) streaming with zero-buffering passthrough. When a provider streams a response, each chunk flows through Ingate to the client as it arrives. Nothing is buffered.
- Internal overhead: <500µs per request
- Time-to-first-token: matches a direct provider connection, apart from the internal overhead above
- Behavior: set "stream": true in your request body as normal. Ingate detects the streaming response and switches to passthrough mode automatically
# Streaming works exactly as you'd expect
curl https://api.ingateai.com/v1/chat/completions \
-H "X-Ingate-Key: sk-ingate-your-key" \
-H "Content-Type: application/json" \
-N \
-d '{
  "model": "gpt-4o",
  "stream": true,
  "messages": [{"role": "user", "content": "Write a haiku"}]
}'
Streaming and logging
What Gets Logged
Every proxied request is sent to an async logging pipeline. Your application's response is never delayed by logging.
For known API shapes (OpenAI, Anthropic, Ollama), Ingate automatically extracts structured fields:
| Field | Description |
|---|---|
| model | Model name from the request or response |
| tokens | Prompt, completion, and total token counts |
| latency | End-to-end request duration in milliseconds |
| status | HTTP status code from the provider |
| cost | Estimated cost based on model pricing |
| user_id | From X-Ingate-User-Id header, if set |
| session_id | From X-Ingate-Session-Id header, if set |
Unknown API shapes
Auth Passthrough
When a provider is configured in passthrough auth mode (no API key stored in Ingate), the client's Authorization header passes through to the upstream provider unchanged. This is useful when:
- Each client has their own provider API key
- You want logging and analytics without centralizing keys
- You're proxying to internal services that use their own auth
# Auth passthrough: your Authorization header reaches the provider
curl https://api.ingateai.com/v1/chat/completions \
-H "X-Ingate-Key: sk-ingate-your-key" \
-H "Authorization: Bearer sk-your-openai-key" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
Passthrough vs. injection
Authorization header.
Provider Key Injection
When a provider is configured with an API key (the typical setup), Ingate injects it into the upstream request automatically. This means:
- Your clients never need the provider's API key. Only the Ingate key is needed
- The Authorization header is set based on the provider's configured auth scheme
- All X-Ingate-* headers are stripped before forwarding
- Provider keys stay server-side. They never appear in client code or logs
# What the client sends:
curl https://api.ingateai.com/v1/chat/completions \
-H "X-Ingate-Key: sk-ingate-your-key" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
# What the provider receives:
# POST https://api.openai.com/v1/chat/completions
# Authorization: Bearer sk-openai-stored-key
# Content-Type: application/json
# (no X-Ingate-* headers)
Providers support flexible auth configuration to work with any upstream service:
| Provider Config | Upstream Header | Example |
|---|---|---|
| Default (Bearer) | Authorization: Bearer <key> | OpenAI, Ollama |
| Custom header + scheme | x-api-key: <key> | Anthropic |
| Query parameter | Appended as ?key=<key> | Legacy APIs |
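Taken together, header stripping and key injection amount to a small transformation of the request headers. The sketch below mirrors the table above. The helper names and the scheme labels bearer, custom-header, and query are illustrative assumptions, not Ingate configuration values.

```shell
# Hypothetical helpers illustrating the documented behavior.

# Drop every X-Ingate-* header from "Name: value" lines read on stdin.
strip_ingate_headers() {
  grep -iv '^x-ingate-' || true   # grep exits 1 when no lines remain; ignore that
}

# Print the credential as it would appear upstream for a given scheme label.
upstream_credential() {
  scheme="$1"
  key="$2"
  case "$scheme" in
    bearer)        printf 'Authorization: Bearer %s\n' "$key" ;;
    custom-header) printf 'x-api-key: %s\n' "$key" ;;
    query)         printf '?key=%s\n' "$key" ;;
    *)             printf 'unknown scheme: %s\n' "$scheme" >&2; return 1 ;;
  esac
}

# Example: the client headers from the request above, as the provider sees them.
printf '%s\n' 'X-Ingate-Key: sk-ingate-your-key' 'Content-Type: application/json' |
  strip_ingate_headers
upstream_credential bearer 'sk-openai-stored-key'
# prints:
#   Content-Type: application/json
#   Authorization: Bearer sk-openai-stored-key
```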
See also
- Authentication: API keys, scopes, rotation
- Configuration: provider setup and auth options
- Audit Log: querying logged requests