REST Ingestion API
Send LLM exchange data to Ingate asynchronously, from sidecars, background jobs, or any application that doesn't use the proxy.
Overview
The proxy captures exchanges automatically, but some architectures can't route traffic through a proxy. The Ingestion API lets you send exchange data to Ingate after the fact via REST. Ingested exchanges go through the same pipeline as proxy traffic: extraction, logging, and eval hooks all run.
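To make "after the fact" concrete, here is a minimal Python sketch of recording an exchange around an LLM call so it can be ingested later. The helper name `capture_exchange` and the `call` parameter are hypothetical, and the sketch assumes the call returned normally (hence the fixed 200 status); the field names follow the request examples on this page.

```python
import time
from typing import Any, Callable, Dict


def capture_exchange(
    provider: str,
    path: str,
    request_body: Dict[str, Any],
    call: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run call(request_body), time it, and return an exchange record.

    `call` stands in for whatever function actually talks to the LLM
    provider (hypothetical; not part of the Ingate API).
    """
    start = time.perf_counter()
    response_body = call(request_body)
    latency_ms = int((time.perf_counter() - start) * 1000)
    return {
        "provider": provider,
        "path": path,
        "method": "POST",
        "status_code": 200,  # assumption: the call completed without an HTTP error
        "latency_ms": latency_ms,
        "request_body": request_body,
        "response_body": response_body,
    }
```

The returned dict can be queued and sent later by a sidecar or background job, which is the scenario this API targets.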
Single Exchange
POST /api/v1/ingest
Ingest a single LLM exchange. Requires member role.
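From application code, the same request could look roughly like the following Python sketch, which uses only the standard library. The function names `build_ingest_request` and `send_exchange` are illustrative, not part of any Ingate SDK; the URL and headers are the ones used in the curl example on this page.

```python
import json
import urllib.request
from typing import Any, Dict

INGEST_URL = "https://api.ingateai.com/api/v1/ingest"


def build_ingest_request(exchange: Dict[str, Any], api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one exchange."""
    return urllib.request.Request(
        INGEST_URL,
        data=json.dumps(exchange).encode("utf-8"),
        headers={
            "X-Ingate-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_exchange(exchange: Dict[str, Any], api_key: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Send one exchange and return the parsed JSON response body."""
    request = build_ingest_request(exchange, api_key)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network.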
bash
curl -X POST https://api.ingateai.com/api/v1/ingest \
-H "X-Ingate-Key: sk-ingate-your-key" \
-H "Content-Type: application/json" \
-d '{
"provider": "openai",
"path": "/v1/chat/completions",
"method": "POST",
"status_code": 200,
"latency_ms": 432,
"request_body": {
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello"}]
},
"response_body": {
"choices": [{"message": {"content": "Hi there!"}}],
"usage": {"prompt_tokens": 10, "completion_tokens": 5}
}
}'
Response (202 Accepted)
json
{
"request_id": "generated-uuid",
"status": "accepted"
}
Batch Ingestion
POST /api/v1/ingest/batch
Ingest multiple exchanges in one call.
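A background job will often need to group queued exchanges into batch bodies and then check which items were rejected. The Python sketch below shows one way to do that. It rests on assumptions: this page does not state a maximum batch size, so `size` is caller-chosen, and rejected items are assumed to carry `"accepted": false` in `items`, mirroring the accepted entries in the response example on this page. Both function names are hypothetical.

```python
from typing import Any, Dict, Iterator, List


def chunk_exchanges(exchanges: List[Dict[str, Any]], size: int) -> Iterator[Dict[str, Any]]:
    """Yield batch request bodies holding at most `size` exchanges each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(exchanges), size):
        yield {"exchanges": exchanges[start:start + size]}


def rejected_indexes(batch_response: Dict[str, Any]) -> List[int]:
    """Return the `index` of every item not marked as accepted.

    Assumption: rejected items appear in `items` with "accepted": false.
    """
    return [
        item["index"]
        for item in batch_response.get("items", [])
        if not item.get("accepted")
    ]
```

The indexes returned by `rejected_indexes` refer to positions within the batch that was sent, so a caller can retry or log just those exchanges.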
bash
curl -X POST https://api.ingateai.com/api/v1/ingest/batch \
-H "X-Ingate-Key: sk-ingate-your-key" \
-H "Content-Type: application/json" \
-d '{
"exchanges": [
{
"provider": "openai",
"path": "/v1/chat/completions",
"method": "POST",
"status_code": 200,
"latency_ms": 350,
"request_body": {...},
"response_body": {...}
},
{
"provider": "anthropic",
"path": "/v1/messages",
"method": "POST",
"status_code": 200,
"latency_ms": 520,
"request_body": {...},
"response_body": {...}
}
]
}'
Response (202 Accepted)
json
{
"accepted": 2,
"rejected": 0,
"items": [
{ "index": 0, "request_id": "uuid-1", "accepted": true },
{ "index": 1, "request_id": "uuid-2", "accepted": true }
]
}
Server-Side Extraction
When full request_body and response_body are provided, Ingate runs server-side extraction to automatically populate:
- model: extracted from the request body
- prompt: user message text
- completion: assistant response text
- token_count: from response usage object
- metadata: finish reason, stop reason, etc.
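To illustrate what these fields correspond to, here is a rough Python sketch of extraction for an OpenAI Chat Completions exchange. It is not Ingate's implementation. In particular, joining multiple user messages with newlines and computing token_count as prompt plus completion tokens are assumptions; this page only says the count comes from the response usage object.

```python
from typing import Any, Dict


def extract_chat_completion(
    request_body: Dict[str, Any], response_body: Dict[str, Any]
) -> Dict[str, Any]:
    """Illustrative extraction for the OpenAI Chat Completions shape."""
    # Collect plain-text user messages from the request.
    user_texts = [
        message["content"]
        for message in request_body.get("messages", [])
        if message.get("role") == "user" and isinstance(message.get("content"), str)
    ]
    choices = response_body.get("choices") or [{}]
    first_choice = choices[0]
    usage = response_body.get("usage", {})
    return {
        "model": request_body.get("model"),
        "prompt": "\n".join(user_texts),  # assumption: user messages joined by newline
        "completion": first_choice.get("message", {}).get("content"),
        # Assumption: total tokens = prompt tokens + completion tokens.
        "token_count": usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0),
        "metadata": {"finish_reason": first_choice.get("finish_reason")},
    }
```

Applied to the single-exchange example on this page, the sketch would yield model `gpt-4o-mini`, prompt `Hello`, completion `Hi there!`, and a token count of 15 under the summing assumption.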
Pre-extracted fields
You can also send pre-extracted fields (model, prompt, completion, token_count) directly. If server-side extraction succeeds, it overwrites pre-extracted values. If it fails, your pre-extracted values are used as fallback.
Supported API Shapes
Server-side extraction supports:
- OpenAI Chat Completions (/v1/chat/completions)
- OpenAI Responses API (/v1/responses)
- Anthropic Messages (/v1/messages)
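The supported paths and the precedence rule for pre-extracted fields can be summarized in a small Python sketch. It is only a model of the documented behavior: the shape labels, the exact-match lookup on path, and the function names are all assumptions, and how Ingate actually detects a shape is not described on this page.

```python
from typing import Any, Dict, Optional

# Paths listed on this page; the label values are hypothetical names.
SUPPORTED_SHAPES = {
    "/v1/chat/completions": "openai_chat_completions",
    "/v1/responses": "openai_responses",
    "/v1/messages": "anthropic_messages",
}


def detect_shape(path: str) -> Optional[str]:
    """Return a shape label for a supported path, or None otherwise.

    Assumption: shapes are matched on the exact path string.
    """
    return SUPPORTED_SHAPES.get(path)


def resolve_fields(
    pre_extracted: Dict[str, Any], extracted: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Apply the documented precedence between the two field sources.

    `extracted` is None when server-side extraction failed.
    """
    if extracted is not None:
        # Extraction succeeded: its values overwrite pre-extracted ones.
        return {**pre_extracted, **extracted}
    # Extraction failed: fall back to what the client sent.
    return dict(pre_extracted)
```

In practice this means pre-extracted fields matter most for exchanges whose path is not in the list above, or whose bodies are missing or truncated.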