CtlTower
Headless, OpenAI-compatible AI router. Point your existing OpenAI client at CtlTower and it dispatches each request to the cheapest capable model across Anthropic, OpenAI, Google, and xAI — with per-key routing preferences and automatic provider fallback. Same request shape, same response shape.
Base URL https://ctltower.com/v1 · OpenAI-compatible · no SDK required
Quickstart
CtlTower speaks the OpenAI Chat Completions API. If your code already calls api.openai.com, change two things: the base URL and the API key. That's the whole integration: the fetch call is the API, and there is no client library.
const res = await fetch('https://ctltower.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.CTLTOWER_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'auto', // let CtlTower pick the model (see Model selection)
    messages: [{ role: 'user', content: 'Hello' }],
  }),
})
const data = await res.json()
console.log(data.choices[0].message.content)
The response is the standard OpenAI chat.completion object, so parse data.choices[0].message exactly as you would OpenAI's. It also works with the official OpenAI SDKs: set baseURL to https://ctltower.com/v1 and apiKey to your secret.
Authentication
Every request needs a bearer token: Authorization: Bearer <secret>. You do not create your own — the CtlTower operator mints a labeled key and hands you the secret. Store it as a private env var (convention: CTLTOWER_API_KEY) and never ship it client-side.
A missing or invalid bearer returns 401. The same secret works on every endpoint below.
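The 401 behaviour can be handled in one place with a thin wrapper around fetch. This is a minimal sketch, not part of CtlTower: the name chatCompletion, the injectable fetchImpl option, and the error messages are all illustrative assumptions.

```javascript
// Hypothetical helper: POST a chat completion and fail loudly on auth errors.
// `fetchImpl` is injectable so the helper can be exercised without a network.
async function chatCompletion(messages, { apiKey = process.env.CTLTOWER_API_KEY, fetchImpl = fetch } = {}) {
  const res = await fetchImpl('https://ctltower.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: 'auto', messages }),
  })
  if (res.status === 401) {
    // Missing or invalid bearer, per the Authentication section.
    throw new Error('CtlTower rejected the bearer token (401): check CTLTOWER_API_KEY')
  }
  if (!res.ok) {
    throw new Error(`CtlTower request failed with HTTP ${res.status}`)
  }
  return res.json()
}
```

Calling chatCompletion([{ role: 'user', content: 'Hello' }]) from server-side code keeps the secret out of the browser, as the convention above requires.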
Model selection
The model field controls routing and accepts the forms below. The field itself is optional: omit it and CtlTower classifies the request for you.
- omit, "auto", or ""
- a tier name: "simple", "moderate", "orchestrator"
- a qualified id: "anthropic/claude-opus-4-7"
- anything else
Tiers map to model classes (e.g. simple → Haiku/Flash-class, moderate → Sonnet/GPT-4o-class, orchestrator → Opus-class). Your operator may also set a default_tier or min_tier on your key; ask them if you need a guaranteed floor.
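To make the accepted forms concrete, here is a small client-side sketch that labels a model value by which form it matches. It is only an illustration: the function name, the labels it returns, and the provider/model pattern check are assumptions, and CtlTower's own parsing may differ.

```javascript
// Tier names taken from the list above.
const TIERS = ['simple', 'moderate', 'orchestrator']

// Hypothetical helper: report which documented form a `model` value matches.
function describeModelField(model) {
  if (model === undefined || model === 'auto' || model === '') return 'auto'
  if (TIERS.includes(model)) return 'tier'
  // Assumed shape of a qualified id: "<provider>/<model>", with one slash.
  if (/^[^/]+\/[^/]+$/.test(model)) return 'qualified'
  return 'other'
}
```

A check like this can be useful in logs, to see which routing form each caller actually sends.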
Streaming
Pass stream: true for an OpenAI-compatible Server-Sent Events response. Chunks are data: {...} lines; the stream ends with data: [DONE].
const res = await fetch('https://ctltower.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.CTLTOWER_API_KEY}`,
  },
  body: JSON.stringify({ messages, stream: true }),
})
const reader = res.body.getReader()
const decoder = new TextDecoder()
for (;;) {
  const { done, value } = await reader.read()
  if (done) break
  // decoder.decode(value) → one or more "data: {...}\n\n" lines, OpenAI-shape
}
- Resolved model + fallbacks ride in the first chunk's non-standard x_routing field (routing isn't known until the first chunk, so it can't be a header).
- Streaming requests skip the classifier (latency would defeat the UX). Default tier is moderate; pass model: "orchestrator" or a qualified id for a stronger model.
- Provider fallback happens only before the first byte. After bytes are sent, a mid-stream provider failure arrives as a final error chunk.
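Network reads do not line up with event boundaries, so a real consumer has to buffer until it sees the blank line that ends each event. The sketch below shows one way to do that while picking up x_routing, the text deltas, and a trailing error. The helper name is hypothetical, and because the exact shapes of x_routing and the error chunk are not specified here, it assumes the error chunk carries a top-level error key and returns x_routing untouched.

```javascript
// Hypothetical helper: consume an OpenAI-shape SSE body and collect the text.
// `chunks` is any async iterable of Uint8Array, e.g. res.body in Node 18+.
async function collectStream(chunks) {
  const decoder = new TextDecoder()
  let buffer = ''
  let text = ''
  let routing = null    // first x_routing value seen, returned as-is
  let error = null      // assumption: the error chunk has a top-level `error` key
  let finished = false  // true once "data: [DONE]" has been seen

  const handleEvent = (event) => {
    for (const line of event.split('\n')) {
      if (!line.startsWith('data:')) continue
      const payload = line.slice(5).trim()
      if (payload === '[DONE]') { finished = true; return }
      const chunk = JSON.parse(payload)
      if (routing === null && chunk.x_routing !== undefined) routing = chunk.x_routing
      if (chunk.error !== undefined) error = chunk.error
      text += chunk.choices?.[0]?.delta?.content ?? ''
    }
  }

  for await (const bytes of chunks) {
    // { stream: true } keeps multi-byte characters intact across reads.
    buffer += decoder.decode(bytes, { stream: true })
    let end
    while (!finished && (end = buffer.indexOf('\n\n')) !== -1) {
      handleEvent(buffer.slice(0, end))
      buffer = buffer.slice(end + 2)
    }
    if (finished) break
  }
  return { text, routing, error, finished }
}
```

In Node 18+ the response body can be passed directly, as in collectStream(res.body). In runtimes where the body is not async-iterable, the same buffering logic fits inside the reader.read() loop shown earlier.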
Tools (function calling)
Send tools in the standard OpenAI shape. CtlTower forwards them to whichever provider serves the request and returns tool_calls the same way OpenAI does. You execute the tool (CtlTower never runs it), then send the result back as a role: "tool" message on the next turn.
// Turn 1: model returns tool_calls
const data = await res.json()
const choice = data.choices[0]
if (choice.finish_reason === 'tool_calls') {
  messages.push(choice.message) // the assistant turn
  for (const call of choice.message.tool_calls) {
    const args = JSON.parse(call.function.arguments)
    const result = await yourTools[call.function.name](args)
    messages.push({
      role: 'tool',
      tool_call_id: call.id,
      content: JSON.stringify(result),
    })
  }
  // Turn 2: POST again with the appended messages (same tools array)
}
- Tool-bearing requests automatically route to a capable model (Sonnet-class or better), because small models with tools are unreliable.
- Echo the same tools array on every turn of the conversation, and include the full message history (CtlTower is stateless).
- parallel_tool_calls: false is honored natively on Anthropic / OpenAI / Grok; Gemini truncates extras post-hoc.
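Putting the two turns and the bullets above together, the whole exchange can be written as a loop that keeps posting until the model stops asking for tools. This is a sketch under stated assumptions: runWithTools, the post callback, the handlers map, and the maxTurns cap are hypothetical names, and reporting a failed tool back to the model as { error } is one design choice among several.

```javascript
// Hypothetical tool loop. `post(body)` sends a chat completion request and
// resolves to the parsed JSON response; `handlers` maps tool names to functions.
async function runWithTools({ post, messages, tools, handlers, maxTurns = 5 }) {
  const history = [...messages]
  for (let turn = 0; turn < maxTurns; turn++) {
    // Full history and the same tools array go out on every turn (stateless API).
    const data = await post({ messages: history, tools })
    const choice = data.choices[0]
    if (choice.finish_reason !== 'tool_calls') {
      return { message: choice.message, history }
    }

    history.push(choice.message) // the assistant turn that requested the tools
    for (const call of choice.message.tool_calls) {
      const handler = handlers[call.function.name]
      let result
      try {
        if (!handler) throw new Error(`unknown tool: ${call.function.name}`)
        result = await handler(JSON.parse(call.function.arguments))
      } catch (err) {
        // Report failures back to the model instead of aborting the conversation.
        result = { error: String(err.message ?? err) }
      }
      history.push({ role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) })
    }
  }
  throw new Error(`no final answer after ${maxTurns} turns`)
}
```

The turn cap guards against a model that keeps requesting tools indefinitely; pick a limit that suits your workload.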
Image input
User messages may carry multi-part content mixing text and images:
{
  role: 'user',
  content: [
    { type: 'text', text: 'What is in this image?' },
    { type: 'image_url', image_url: { url: 'data:image/png;base64,...' } },
  ],
}
OpenAI, Grok, and Anthropic accept both data: and https:// URLs. Gemini accepts data: URLs only, so a Gemini-served request with an https:// image will fall through to the next provider in the chain.
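Since data: URLs are the one form every provider accepts, inlining the image is a simple way to avoid the Gemini fall-through. A minimal sketch for Node follows; the helper names imagePart and imageMessage are hypothetical, and the default MIME type is an assumption you should override for non-PNG images.

```javascript
// Hypothetical helper: wrap raw image bytes in a data: URL message part.
// Buffer is Node-specific; in a browser, FileReader.readAsDataURL does the same job.
function imagePart(bytes, mimeType = 'image/png') {
  const base64 = Buffer.from(bytes).toString('base64')
  return { type: 'image_url', image_url: { url: `data:${mimeType};base64,${base64}` } }
}

// Build a complete multi-part user message: one text part plus one image part.
function imageMessage(text, bytes, mimeType) {
  return { role: 'user', content: [{ type: 'text', text }, imagePart(bytes, mimeType)] }
}
```

For example, imageMessage('What is in this image?', await fs.promises.readFile('photo.png')) produces a message ready to append to messages, assuming fs is Node's built-in file system module and photo.png is a placeholder path.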
Structured JSON output
response_format is supported in both forms:
response_format: { type: 'json_object' }
// or
response_format: {
  type: 'json_schema',
  json_schema: { name: 'my_schema', schema: { /* JSON Schema */ }, strict: true },
}
Honored natively on OpenAI / Grok / Gemini. Anthropic has no equivalent, so CtlTower synthesizes a forced tool internally and unwraps the JSON back into the response text. The caller sees a uniform JSON-string body regardless of which provider answered.
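Because the JSON comes back as a string in the message content, the caller still has to parse it. The sketch below pairs a request builder with a parser; both function names are hypothetical, and wrapping the parse error in a friendlier message is just one possible choice.

```javascript
// Hypothetical helper: build a request body that asks for schema-shaped JSON.
function buildSchemaRequest(messages, name, schema) {
  return {
    messages,
    response_format: { type: 'json_schema', json_schema: { name, schema, strict: true } },
  }
}

// The JSON arrives as a string in message.content, so it still needs parsing.
function parseStructured(data) {
  const content = data.choices[0].message.content
  try {
    return JSON.parse(content)
  } catch (err) {
    throw new Error(`response content was not valid JSON: ${err.message}`)
  }
}
```

The same parseStructured call works whichever provider answered, since the body is uniform.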
Embeddings
POST /v1/embeddings — OpenAI-compatible vector embeddings, through your existing bearer. input is a string or an array of strings:
const res = await fetch('https://ctltower.com/v1/embeddings', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.CTLTOWER_API_KEY}`,
  },
  body: JSON.stringify({
    input: ['first text', 'second text'], // string or string[]
    // model: omit for the default, or "text-embedding-3-small", etc.
    // dimensions: 512, // optional (3-* models only)
  }),
})
const { data } = await res.json()
// data[i].embedding is the vector for input[i], in order.
Response is the standard OpenAI embeddings envelope: { object: "list", data: [{ object, index, embedding }], model, usage }. encoding_format (float default, or base64) and dimensions pass through. Token-array input is not supported, so send text.
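A common next step is comparing the returned vectors. The sketch below ranks embedded inputs against a query vector by cosine similarity; the helper names are hypothetical and nothing here is specific to CtlTower beyond the data array shape described above.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical helper: rank the embedded inputs against a query vector.
// `data` is the `data` array from the embeddings response; `inputs` is the
// array that was sent, so item.index maps each vector back to its text.
function rankBySimilarity(queryVector, data, inputs) {
  return data
    .map((item) => ({ text: inputs[item.index], score: cosineSimilarity(queryVector, item.embedding) }))
    .sort((x, y) => y.score - x.score)
}
```

This assumes the default float encoding; with encoding_format set to base64 the vectors would need decoding first.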
Audio (STT + TTS)
Two more OpenAI-compatible endpoints, same bearer:
POST /v1/audio/transcriptions: speech to text
file + optional model / prompt / language / response_format. Returns { text } (or plain text for srt/vtt). Whisper-compatible.
POST /v1/audio/speech: text to speech
{ input, voice, response_format?, speed? }. Returns raw audio bytes with the matching Content-Type. tts-1-compatible.
Response headers
Non-streaming responses carry routing metadata you can log or surface:
- X-TskPilot-Request-Id
- X-TskPilot-Resolved-Model
- X-TskPilot-Complexity
- X-TskPilot-Fallbacks
(Header names carry the X-TskPilot- prefix for historical reasons: CtlTower was formerly TskPilot. The names are stable, so do not expect them to change.)
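Reading these headers off a fetch response takes a few lines. In this sketch the helper name and the returned property names are hypothetical, and since the value formats are not described here, every header is returned as the raw string (or null when absent).

```javascript
// Hypothetical helper: pull CtlTower routing metadata off a fetch Response.
// Header lookups are case-insensitive; missing headers come back as null.
function routingMetadata(res) {
  return {
    requestId: res.headers.get('X-TskPilot-Request-Id'),
    resolvedModel: res.headers.get('X-TskPilot-Resolved-Model'),
    complexity: res.headers.get('X-TskPilot-Complexity'),
    fallbacks: res.headers.get('X-TskPilot-Fallbacks'),
  }
}
```

Logging requestId alongside your own trace ids makes it easier to raise a specific request with your operator.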
Endpoints
- POST /v1/chat/completions · Bearer
- POST /v1/embeddings · Bearer
- POST /v1/audio/transcriptions · Bearer
- POST /v1/audio/speech · Bearer
- GET /api/config · Bearer
- GET /api/recent · Bearer
Caching
CtlTower caches its own internal routing decisions, which is invisible to you — the response shape is identical whether the routing was cached or computed fresh. It does not cache model responses, so you never get a stale answer to a time- or context-sensitive question.
A cache_hint: 'stable' | 'volatile' | 'auto' request field is accepted and validated but currently inert — reserved for a future response cache. Sending it has no effect today.