Three-Axis Model
Omegon organizes inference along three independent axes. The agent adjusts these autonomously based on task complexity, or operators can override explicitly. This gives fine-grained control over capability, cost, and context without coupling them.
Axis 1: Capability Tier
Controls which model family is used. The agent switches tiers with set_model_tier.
| Tier | Use Case | Typical Models |
|---|---|---|
| local | On-device, zero API cost | Ollama (qwen3, devstral, llama) |
| retribution | Simple lookups, boilerplate | Claude Haiku, GPT-4o-mini, Groq |
| victory | Routine coding, execution | Claude Sonnet, GPT-4o, Codestral |
| gloriana | Architecture, deep reasoning | Claude Opus, o1-pro |
```
/model                    # Opens selector with all available models
set_model_tier(victory)   # Agent self-selects (automatic)
```

Axis 2: Thinking Level
Controls extended reasoning budget — how many tokens the model spends "thinking" before responding.
| Level | Use Case |
|---|---|
| off | No extended thinking. Fastest, cheapest. |
| minimal | Brief sanity check. |
| low | Light reasoning for straightforward tasks. |
| medium | Default. Balanced reasoning for general work. |
| high | Deep reasoning for architecture, debugging, multi-step problems. |
```
/think high   # Manual override
/think off    # Disable for speed
```

Axis 3: Context Class
Controls context window capacity. Larger windows allow more conversation history and memory injection.
| Class | Window | Use Case |
|---|---|---|
| Squad | 128k tokens | Quick tasks, simple edits |
| Maniple | 272k tokens | Standard development work |
| Clan | 400k tokens | Large refactors, multi-file analysis |
| Legion | 1M tokens | Massive codebases, full-project comprehension |
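The class-to-window mapping in the table above can be sketched as a simple enum. This is an illustrative sketch, not Omegon's actual types; the token counts come from the table, but the names `ContextClass` and `window_tokens` are assumptions:

```rust
/// Context classes from the table above. The token counts mirror the
/// documented windows; the type and method names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ContextClass {
    Squad,
    Maniple,
    Clan,
    Legion,
}

impl ContextClass {
    /// Maximum context window in tokens for each class.
    fn window_tokens(self) -> u32 {
        match self {
            ContextClass::Squad => 128_000,
            ContextClass::Maniple => 272_000,
            ContextClass::Clan => 400_000,
            ContextClass::Legion => 1_000_000,
        }
    }
}

fn main() {
    // Legion is roughly 8x Squad: full-project comprehension vs. quick edits.
    println!("legion window: {}", ContextClass::Legion.window_tokens());
}
```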
```
/context legion   # Pin to maximum context
/context squad    # Minimize for cost
```

Orthogonality
The three axes are independent. You can run gloriana + off + squad (powerful model,
no thinking, small window) or retribution + high + legion (cheap model, deep reasoning,
huge context). The agent tunes these combinations based on what it's doing:
- Simple file read → victory + off
- Architecture decision → gloriana + high
- Boilerplate generation → retribution + minimal
- Large codebase audit → victory + medium + legion
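The orthogonality described above amounts to a plain product type: each axis is an independent field, and every combination is valid. A minimal sketch, with hypothetical names rather than Omegon's real types:

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Tier { Local, Retribution, Victory, Gloriana }

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Thinking { Off, Minimal, Low, Medium, High }

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Context { Squad, Maniple, Clan, Legion }

/// One inference configuration: a point in the three-axis space.
/// Because the axes are independent, all 4 * 5 * 4 = 80 combinations are legal.
#[derive(Debug, Clone, Copy)]
struct InferenceConfig {
    tier: Tier,
    thinking: Thinking,
    context: Context,
}

fn main() {
    // "Powerful model, no thinking, small window" from the text above.
    let quick_power = InferenceConfig {
        tier: Tier::Gloriana,
        thinking: Thinking::Off,
        context: Context::Squad,
    };
    // "Cheap model, deep reasoning, huge context."
    let cheap_deep = InferenceConfig {
        tier: Tier::Retribution,
        thinking: Thinking::High,
        context: Context::Legion,
    };
    println!("{:?}\n{:?}", quick_power, cheap_deep);
}
```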
Provider Matrix
Which providers serve which tiers depends on available credentials and model capabilities:
| Provider | local | retribution | victory | gloriana |
|---|---|---|---|---|
| Anthropic | — | Haiku | Sonnet | Opus |
| OpenAI | — | 4o-mini | 4o / o3 | o1-pro |
| Codex | — | — | codex-mini | — |
| OpenRouter | — | varies | varies | varies |
| Groq | — | ✓ | ✓ | — |
| xAI | — | — | Grok | — |
| Mistral | — | — | Codestral | — |
| Cerebras | — | ✓ | — | — |
| Ollama | ✓ | — | — | — |
The routing engine (routing.rs) scores available providers against the requested
tier and selects the best match. See Providers for details.
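The scoring idea can be sketched as follows. This is an assumption-laden illustration of the described behavior, not routing.rs's actual code: the `Provider` fields, the flat scores, and the example provider list are all hypothetical, and a real scorer would weigh cost, latency, and model capability:

```rust
/// A candidate provider: which tiers it can serve, given credentials.
struct Provider {
    name: &'static str,
    has_credentials: bool,
    serves: &'static [&'static str],
}

/// Score a provider against a requested tier. Providers without
/// credentials, or that cannot serve the tier, score zero.
fn score(p: &Provider, tier: &str) -> u32 {
    if !p.has_credentials || !p.serves.contains(&tier) {
        0
    } else {
        1 // a real scorer would add cost/latency/capability weights here
    }
}

/// Pick the best-scoring provider for a tier, if any can serve it.
fn route<'a>(providers: &'a [Provider], tier: &str) -> Option<&'a Provider> {
    providers
        .iter()
        .filter(|p| score(p, tier) > 0)
        .max_by_key(|p| score(p, tier))
}

fn main() {
    let providers = [
        Provider { name: "anthropic", has_credentials: true, serves: &["retribution", "victory", "gloriana"] },
        Provider { name: "ollama", has_credentials: true, serves: &["local"] },
        Provider { name: "groq", has_credentials: false, serves: &["retribution", "victory"] },
    ];
    // Only Anthropic has credentials AND serves the gloriana tier.
    if let Some(best) = route(&providers, "gloriana") {
        println!("routing gloriana -> {}", best.name);
    }
}
```

Note how the credential check mirrors the matrix above: a provider row only participates in routing when its credentials are present.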