# Providers
Omegon connects to inference providers through native Rust HTTP clients — no subprocess shims, no Node.js at runtime. Each provider has a dedicated client that handles authentication, streaming, tool call wire formats, and error recovery natively.
## Inference Providers
| Provider | Client | Auth | Env Var |
|---|---|---|---|
| Anthropic | AnthropicClient (native SSE) | OAuth or API key | ANTHROPIC_API_KEY |
| OpenAI | OpenAIClient (native) | OAuth or API key | OPENAI_API_KEY |
| OpenAI Codex | CodexClient (Responses API) | OAuth JWT | CHATGPT_OAUTH_TOKEN |
| OpenRouter | OpenRouterClient (wrapper) | API key | OPENROUTER_API_KEY |
| Groq | OpenAICompatClient | API key | GROQ_API_KEY |
| xAI (Grok) | OpenAICompatClient | API key | XAI_API_KEY |
| Mistral | OpenAICompatClient | API key | MISTRAL_API_KEY |
| Cerebras | OpenAICompatClient | API key | CEREBRAS_API_KEY |
| HuggingFace | OpenAICompatClient | API key | HF_TOKEN |
| Ollama | OpenAICompatClient | Local (no auth) | OLLAMA_HOST |
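The provider-to-env-var mapping in the table above amounts to a simple lookup. A minimal sketch, with an illustrative function name that is not Omegon's actual API:

```rust
// Hypothetical lookup: provider name -> the env var that supplies its key.
// The mapping mirrors the table above; the function itself is illustrative.
fn api_key_env_var(provider: &str) -> Option<&'static str> {
    match provider {
        "anthropic" => Some("ANTHROPIC_API_KEY"),
        "openai" => Some("OPENAI_API_KEY"),
        "codex" => Some("CHATGPT_OAUTH_TOKEN"),
        "openrouter" => Some("OPENROUTER_API_KEY"),
        "groq" => Some("GROQ_API_KEY"),
        "xai" => Some("XAI_API_KEY"),
        "mistral" => Some("MISTRAL_API_KEY"),
        "cerebras" => Some("CEREBRAS_API_KEY"),
        "huggingface" => Some("HF_TOKEN"),
        // Ollama is local and keyless; OLLAMA_HOST only sets the endpoint.
        "ollama" => None,
        _ => None,
    }
}

fn main() {
    assert_eq!(api_key_env_var("groq"), Some("GROQ_API_KEY"));
    assert_eq!(api_key_env_var("ollama"), None);
}
```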
## Search Providers
| Provider | Auth | Env Var |
|---|---|---|
| Brave Search | API key | BRAVE_API_KEY |
| Tavily | API key | TAVILY_API_KEY |
| Serper | API key | SERPER_API_KEY |
## Client Architecture
There are five distinct Rust client implementations:
- AnthropicClient — Native SSE streaming against the Anthropic Messages API. Handles thinking blocks, tool use, and multi-turn continuity natively.
- OpenAIClient — Native client for the OpenAI Chat Completions API. Handles function calling, streaming, and structured outputs.
- CodexClient — Implements the OpenAI Responses API (completely different wire protocol from Chat Completions). Uses JWT OAuth with account ID extraction from claims. Handles compound tool call IDs and SSE parsing.
- OpenRouterClient — Wraps OpenAIClient with base URL override and model prefix handling for OpenRouter's 200+ model catalog.
- OpenAICompatClient — Generic client for the 6 providers that implement the OpenAI-compatible API (Groq, xAI, Mistral, Cerebras, HuggingFace, Ollama). Wraps OpenAIClient with per-provider base URL and model mapping.
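The generic-wrapper idea can be sketched as one client parameterized per provider. Struct and method names below are assumptions, not Omegon's real types, and only a few of the six compatible providers are shown; the base URLs are those providers' publicly documented OpenAI-compatible endpoints and should be verified against each provider's docs:

```rust
// Sketch: one generic client, specialized by base URL and key env var.
// Names are illustrative; Omegon's actual OpenAICompatClient may differ.
struct OpenAICompatClient {
    base_url: String,
    api_key: Option<String>,
}

impl OpenAICompatClient {
    fn for_provider(name: &str) -> Option<OpenAICompatClient> {
        // (base URL, env var holding the key; Ollama is local and keyless)
        let (base_url, env_var) = match name {
            "groq" => ("https://api.groq.com/openai/v1", Some("GROQ_API_KEY")),
            "xai" => ("https://api.x.ai/v1", Some("XAI_API_KEY")),
            "mistral" => ("https://api.mistral.ai/v1", Some("MISTRAL_API_KEY")),
            "cerebras" => ("https://api.cerebras.ai/v1", Some("CEREBRAS_API_KEY")),
            "ollama" => ("http://localhost:11434/v1", None),
            _ => return None,
        };
        let api_key = env_var.and_then(|v| std::env::var(v).ok());
        Some(OpenAICompatClient { base_url: base_url.to_string(), api_key })
    }
}

fn main() {
    let groq = OpenAICompatClient::for_provider("groq").unwrap();
    assert_eq!(groq.base_url, "https://api.groq.com/openai/v1");
    let ollama = OpenAICompatClient::for_provider("ollama").unwrap();
    assert!(ollama.api_key.is_none()); // no auth for local inference
}
```

The design benefit is that new OpenAI-compatible providers only add a match arm, not a new client.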
## Routing & Failover
The routing engine (routing.rs) scores available providers against the requested capability tier and selects the best match. If the primary provider fails, it automatically falls back to the next-best provider. See Three-Axis Model.
- Providers are scored by capability match, latency history, and rate-limit state
- Rate-limited providers enter cooldown and are deprioritized until recovery
- Routing tracks provider usage and rate-limit headers so retries and failover can respond to real runtime signals
- The agent can switch providers mid-session with set_model_tier
- Local inference (Ollama) is always available as a zero-cost fallback
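The scoring-and-failover behavior described above can be sketched as follows. The field names and weights are illustrative assumptions, not the internals of routing.rs:

```rust
use std::time::{Duration, Instant};

// Hypothetical per-provider state; Omegon's routing.rs may track more signals.
struct ProviderState {
    name: &'static str,
    capability_match: f32,           // 0.0..=1.0 fit for the requested tier
    avg_latency_ms: f32,             // rolling latency history
    cooldown_until: Option<Instant>, // set when rate-limit headers are seen
}

impl ProviderState {
    fn score(&self, now: Instant) -> f32 {
        // Rate-limited providers sit out until their cooldown expires.
        if self.cooldown_until.is_some_and(|t| t > now) {
            return f32::MIN;
        }
        // Capability fit dominates; latency history breaks ties.
        self.capability_match - self.avg_latency_ms / 10_000.0
    }
}

/// Best-scoring provider first, so a failed call falls over to the next entry.
fn failover_order(providers: &mut [ProviderState], now: Instant) -> Vec<&'static str> {
    providers.sort_by(|a, b| b.score(now).partial_cmp(&a.score(now)).unwrap());
    providers.iter().map(|p| p.name).collect()
}

fn main() {
    let now = Instant::now();
    let mut providers = [
        ProviderState { name: "anthropic", capability_match: 0.9, avg_latency_ms: 800.0,
                        cooldown_until: Some(now + Duration::from_secs(30)) }, // rate-limited
        ProviderState { name: "groq", capability_match: 0.8, avg_latency_ms: 200.0,
                        cooldown_until: None },
        ProviderState { name: "ollama", capability_match: 0.4, avg_latency_ms: 1500.0,
                        cooldown_until: None }, // local zero-cost fallback
    ];
    // Anthropic is on cooldown, so groq leads and ollama backs it up.
    assert_eq!(failover_order(&mut providers, now), ["groq", "ollama", "anthropic"]);
}
```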
## Authentication
Credentials are stored in the system keychain (macOS Keychain, Linux Secret Service) or encrypted on disk. OAuth tokens are refreshed automatically.
```shell
# OAuth login (Claude Pro/Max, ChatGPT Plus/Pro)
omegon login
/login anthropic
/login openai

# API key (set in environment or via /secrets)
export ANTHROPIC_API_KEY="sk-ant-..."
/secrets set groq GROQ_API_KEY

# Check status
/auth status
```
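Since a credential may live in the secure store or the environment, resolution is a two-step lookup. The precedence shown here (secure store first, then environment) is an assumption for illustration, not documented Omegon behavior, and the keychain access is stubbed out:

```rust
// Hypothetical credential resolution. The secure_store closure stands in for
// the OS keychain / Secret Service lookup; the precedence is an assumption.
fn resolve_credential<F>(env_var: &str, secure_store: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    secure_store(env_var).or_else(|| std::env::var(env_var).ok())
}

fn main() {
    // A stub store that only knows one key.
    let store = |k: &str| (k == "GROQ_API_KEY").then(|| "gsk-demo".to_string());
    assert_eq!(resolve_credential("GROQ_API_KEY", store).as_deref(), Some("gsk-demo"));
    // Unknown key and unset env var: no credential found.
    assert_eq!(resolve_credential("OMEGON_UNSET_DEMO_VAR", store), None);
}
```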