Proxy & Routing

How Ingate routes LLM requests to providers with automatic path detection, wildcard routing, SSE streaming passthrough, and sub-millisecond overhead.

How Routing Works

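Ingate uses wildcard routing: any request path that doesn't match one of Ingate's own internal routes (served under /api/ and /ui/, for example /api/v1/logs) is treated as a proxy request and forwarded to the target provider. A provider path such as Ollama's /api/generate matches no internal route, so it is proxied even though it begins with /api/.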

```bash
# All of these are proxied; Ingate doesn't hard-code endpoint paths
POST /v1/chat/completions      → proxied
POST /v1/responses             → proxied
POST /v1/messages              → proxied
POST /v1/embeddings            → proxied
POST /api/generate             → proxied (Ollama)
POST /anything/you/want        → proxied

# These are internal Ingate routes (not proxied)
GET  /api/v1/logs              → Ingate API
GET  /ui/dashboard             → Ingate dashboard
```

Provider Resolution

Ingate determines which provider to route to using three methods, checked in order:

  1. Explicit header: set X-Ingate-Provider: openai to target a specific provider by name.
  2. Path auto-detection: Ingate recognizes known API paths and routes them automatically. For example, /v1/chat/completions routes to your configured OpenAI provider, and /v1/messages routes to your configured Anthropic provider.
  3. Default provider: if no header is set and the path isn't recognized, the request routes to your organization's default provider.
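
The resolution order above can be sketched as a small shell function. This is an illustration only, not Ingate's implementation: the function name resolve_provider, the provider names (openai, anthropic, ollama), and the exact path table are assumptions drawn from the examples on this page.

```bash
# Hypothetical sketch of the three-step resolution order.
# $1 = value of X-Ingate-Provider ("" if the header is absent)
# $2 = request path
# $3 = the organization's default provider
resolve_provider() {
  if [ -n "$1" ]; then
    # 1. An explicit header always wins.
    printf '%s\n' "$1"
  else
    case "$2" in
      # 2. Path auto-detection (assumed path table).
      /v1/chat/completions|/v1/responses) printf 'openai\n' ;;
      /v1/messages)                       printf 'anthropic\n' ;;
      /api/generate|/api/chat)            printf 'ollama\n' ;;
      # 3. Unrecognized path: fall back to the default provider.
      *)                                  printf '%s\n' "$3" ;;
    esac
  fi
}

resolve_provider "azure-openai" /v1/chat/completions openai   # header wins
resolve_provider "" /v1/messages openai                       # auto-detected
resolve_provider "" /custom/path my-default                   # default provider
```

In this sketch the header check comes first, so an explicit X-Ingate-Provider overrides auto-detection even on a known path.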

Header is optional for known paths

Because of auto-detection, you don't need to set X-Ingate-Provider for standard OpenAI, Anthropic, or Ollama paths. Just point your SDK at Ingate and it works.

```bash
# Auto-detected, no X-Ingate-Provider needed
curl https://api.ingateai.com/v1/chat/completions \
  -H "X-Ingate-Key: sk-ingate-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```

```bash
# Explicit provider, useful for custom endpoints or overriding auto-detect
curl https://api.ingateai.com/v1/chat/completions \
  -H "X-Ingate-Provider: azure-openai" \
  -H "X-Ingate-Key: sk-ingate-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```

Supported API Shapes

Ingate works with any HTTP endpoint, but has first-class extraction for these shapes:

| API Shape | Example Path | Auto-Detected |
|---|---|---|
| OpenAI Chat Completions | /v1/chat/completions | Yes |
| OpenAI Responses API | /v1/responses | Yes |
| Anthropic Messages | /v1/messages | Yes |
| Ollama | /api/generate, /api/chat | Yes |
| Custom / Unknown | Any path | No, use header or default |

Request Headers

| Header | Required | Description |
|---|---|---|
| X-Ingate-Key | Yes | Your Ingate API key. Authenticates the request and determines your org, plan, and entitlements. |
| X-Ingate-Provider | No | Target provider name. Optional when auto-detection or a default provider can resolve the target. |
| X-Ingate-Translate | No | Set to true to opt into request/response format translation between API shapes. |
| X-Ingate-User-Id | No | Arbitrary user identifier. Logged with the request for per-user analytics and filtering. |
| X-Ingate-Session-Id | No | Arbitrary session identifier. Groups related requests in logs for conversation-level tracking. |

Ingate headers are stripped

All X-Ingate-* headers are stripped from the upstream request before forwarding. The provider never sees them.
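
As a rough illustration of the stripping rule, the filter below drops every X-Ingate-* line from a list of headers. The function name strip_ingate_headers and the sample values are hypothetical, and this is a sketch rather than Ingate's actual code. Because HTTP header names are case-insensitive, the match ignores case.

```bash
# Hypothetical sketch: remove every X-Ingate-* header from
# "Name: value" lines read on stdin, keeping everything else.
strip_ingate_headers() {
  grep -iv '^x-ingate-'
}

printf '%s\n' \
  'X-Ingate-Key: sk-ingate-your-key' \
  'x-ingate-user-id: user-123' \
  'Content-Type: application/json' |
  strip_ingate_headers
# prints: Content-Type: application/json
```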

Response Headers

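Ingate adds the following headers to proxied responses, alongside standard security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy). X-Ingate-Request-Id is present on every proxied response; X-Ingate-Served-By appears only when a fallback provider served the request.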

| Header | Description |
|---|---|
| X-Ingate-Request-Id | Unique UUID for this request. Use it to look up logs, debug issues, and correlate across systems. |
| X-Ingate-Served-By | Present only on fallback. Shows which provider actually served the response when the primary failed. |
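
To correlate a request with its log entry, a client can save the response headers and pull out X-Ingate-Request-Id. The helper below is a minimal sketch; the name header_value and the file names in the usage comment are assumptions. It matches the header name case-insensitively and tolerates the CRLF line endings used in HTTP header dumps.

```bash
# Hypothetical helper: print the value of one response header.
# $1 = header name; the header dump is read from stdin.
header_value() {
  awk -v name="$1" '
    {
      sub(/\r$/, "")                 # drop the trailing CR of a CRLF line
      i = index($0, ":")
      if (i && tolower(substr($0, 1, i - 1)) == tolower(name)) {
        value = substr($0, i + 1)
        sub(/^[ \t]+/, "", value)    # trim whitespace after the colon
        print value
        exit
      }
    }'
}

# Usage (not run here): dump headers with curl -D, then extract the ID.
#   curl -sS -D headers.txt -o response.json \
#     https://api.ingateai.com/v1/chat/completions \
#     -H "X-Ingate-Key: sk-ingate-your-key" \
#     -H "Content-Type: application/json" \
#     -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
#   header_value X-Ingate-Request-Id < headers.txt
```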

Streaming

Ingate passes SSE (Server-Sent Events) streams through without buffering: when a provider streams a response, each chunk is forwarded to the client as it arrives.

  • Internal overhead: <500µs per request
  • Time-to-first-token: identical to a direct provider connection
  • Behavior: set "stream": true in your request body as normal. Ingate detects the streaming response and switches to passthrough mode automatically.

```bash
# Streaming works exactly as you'd expect
curl https://api.ingateai.com/v1/chat/completions \
  -H "X-Ingate-Key: sk-ingate-your-key" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gpt-4o",
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku"}]
  }'
```
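
On the client side, a passthrough stream can be consumed line by line as it arrives. The sketch below assumes the OpenAI-style framing for chat completion streams (each event is a "data: <json>" line and the stream ends with "data: [DONE]"); other providers frame their events differently. The function name print_sse_payloads is hypothetical.

```bash
# Sketch: read an SSE stream on stdin and print each event payload
# as soon as its line arrives. Stops at the "data: [DONE]" sentinel.
print_sse_payloads() {
  while IFS= read -r line; do
    line=${line%"$(printf '\r')"}    # tolerate CRLF line endings
    case "$line" in
      'data: [DONE]') break ;;
      'data: '*)      printf '%s\n' "${line#data: }" ;;
    esac
  done
}

# Usage (not run here): pipe the streaming curl request into the function.
#   curl -sS -N https://api.ingateai.com/v1/chat/completions ... | print_sse_payloads
```

The -N flag matters here: it turns off curl's own output buffering, so payloads reach the loop as they are received.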

Streaming and logging

Streaming responses are logged after the stream completes. The full response is reassembled from chunks for token counting and cost calculation, but this happens asynchronously and never delays the stream.

What Gets Logged

Every proxied request is sent to an async logging pipeline. Your application's response is never delayed by logging.

For known API shapes (OpenAI, Anthropic, Ollama), Ingate automatically extracts structured fields:

| Field | Description |
|---|---|
| model | Model name from the request or response |
| tokens | Prompt, completion, and total token counts |
| latency | End-to-end request duration in milliseconds |
| status | HTTP status code from the provider |
| cost | Estimated cost based on model pricing |
| user_id | From X-Ingate-User-Id header, if set |
| session_id | From X-Ingate-Session-Id header, if set |
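
To give a feel for how a cost estimate can follow from the tokens field, here is a minimal sketch that multiplies token counts by per-million-token prices. The formula, the function name estimate_cost, and the prices in the example are all assumptions for illustration; Ingate's actual pricing data and calculation are not described on this page.

```bash
# Hypothetical sketch: estimated cost in USD from token counts.
# $1 = prompt tokens        $2 = completion tokens
# $3 = input price per 1M   $4 = output price per 1M (made-up values below)
estimate_cost() {
  awk -v p="$1" -v c="$2" -v pin="$3" -v pout="$4" \
    'BEGIN { printf "%.6f\n", (p * pin + c * pout) / 1000000 }'
}

estimate_cost 1000 500 2.50 10.00
# prints: 0.007500
```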

Unknown API shapes

For custom or unrecognized endpoints, Ingate captures the raw request and response bodies instead of extracting structured fields. You get full observability regardless of the API shape.

Auth Passthrough

When a provider is configured in passthrough auth mode (no API key stored in Ingate), the client's Authorization header passes through to the upstream provider unchanged. This is useful when:

  • Each client has their own provider API key
  • You want logging and analytics without centralizing keys
  • You're proxying to internal services that use their own auth

```bash
# Auth passthrough: your Authorization header reaches the provider
curl https://api.ingateai.com/v1/chat/completions \
  -H "X-Ingate-Key: sk-ingate-your-key" \
  -H "Authorization: Bearer sk-your-openai-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```

Passthrough vs. injection

Auth passthrough only applies when the provider has no API key configured in Ingate. If a provider has a stored key, Ingate always injects it and overwrites any client-sent Authorization header.
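
The rule can be summarized in a few lines of shell. This is a hypothetical sketch, not Ingate's code: the function name upstream_authorization is made up, and it assumes the provider uses the default Bearer scheme.

```bash
# Hypothetical sketch of passthrough vs. injection.
# $1 = provider key stored in Ingate ("" when in passthrough mode)
# $2 = Authorization header value sent by the client ("" if none)
upstream_authorization() {
  if [ -n "$1" ]; then
    # A stored key always wins; the client's value is overwritten.
    printf 'Bearer %s\n' "$1"
  else
    # Passthrough: the client's header is forwarded unchanged.
    printf '%s\n' "$2"
  fi
}

upstream_authorization "sk-openai-stored-key" "Bearer sk-client-key"
# prints: Bearer sk-openai-stored-key
upstream_authorization "" "Bearer sk-client-key"
# prints: Bearer sk-client-key
```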

Provider Key Injection

When a provider is configured with an API key (the typical setup), Ingate injects it into the upstream request automatically. This means:

  • Your clients never need the provider's API key. Only the Ingate key is needed
  • The Authorization header is set based on the provider's configured auth scheme
  • All X-Ingate-* headers are stripped before forwarding
  • Provider keys stay server-side. They never appear in client code or logs

```bash
# What the client sends:
curl https://api.ingateai.com/v1/chat/completions \
  -H "X-Ingate-Key: sk-ingate-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

# What the provider receives:
# POST https://api.openai.com/v1/chat/completions
# Authorization: Bearer sk-openai-stored-key
# Content-Type: application/json
# (no X-Ingate-* headers)
```

Providers support flexible auth configuration to work with any upstream service:

| Provider Config | Upstream Header | Example |
|---|---|---|
| Default (Bearer) | Authorization: Bearer <key> | OpenAI, Ollama |
| Custom header + scheme | x-api-key: <key> | Anthropic |
| Query parameter | Appended as ?key=<key> | Legacy APIs |
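
The three configurations in the table could shape the upstream request roughly as follows. This is a sketch under assumptions: the function name apply_provider_auth, the scheme labels (bearer, header, query), and the example URL are hypothetical, and the key is appended without URL encoding.

```bash
# Hypothetical sketch: produce the upstream header line (or URL) for
# each auth configuration in the table.
# $1 = scheme: bearer | header | query
# $2 = stored provider key
# $3 = header name (header scheme) or upstream URL (query scheme)
apply_provider_auth() {
  case "$1" in
    bearer) printf 'Authorization: Bearer %s\n' "$2" ;;
    header) printf '%s: %s\n' "$3" "$2" ;;
    query)
      # Use & instead of ? when the URL already has a query string.
      case "$3" in
        *\?*) printf '%s&key=%s\n' "$3" "$2" ;;
        *)    printf '%s?key=%s\n' "$3" "$2" ;;
      esac ;;
    *) printf 'unknown auth scheme: %s\n' "$1" >&2; return 1 ;;
  esac
}

apply_provider_auth bearer sk-test
# prints: Authorization: Bearer sk-test
apply_provider_auth header sk-test x-api-key
# prints: x-api-key: sk-test
apply_provider_auth query sk-test https://legacy.example.com/v1/generate
# prints: https://legacy.example.com/v1/generate?key=sk-test
```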
