Routing Policy
Omegon's routing layer selects concrete provider/model combinations for each inference request based on three constraints: capability tier, thinking level, and context class.
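The three constraints can be modeled as a small request record. This is a minimal sketch; the type and field names (`RouteRequest`, `thinking`, `context_floor`) are illustrative, not Omegon's actual identifiers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RouteRequest:
    """The three routing constraints carried by each inference request."""
    tier: int           # minimum capability tier the model must satisfy
    thinking: str       # requested thinking level
    context_floor: int  # minimum context window (tokens) the session needs

req = RouteRequest(tier=2, thinking="high", context_floor=100_000)
```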
Selection Flow
- Filter to authenticated providers — only models with configured API keys or active OAuth sessions are considered
- Match capability tier — filter to models that satisfy the requested tier (higher tiers can serve lower requests)
- Check context floor — exclude models whose context ceiling is below the session's required minimum
- Apply provider preference — order by the operator's provider preference (default: Anthropic first)
- Check cooldowns — exclude providers/models that are in a cooldown period from recent failures
- Select best candidate — first viable match wins
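The flow above can be sketched as a filter-then-order pipeline. This is an illustrative sketch, not Omegon's implementation; the `Candidate` shape, provider names, and default preference tuple are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass(frozen=True)
class Candidate:
    provider: str
    model: str
    tier: int             # capability tier; higher tiers can serve lower requests
    context_ceiling: int  # maximum context window in tokens
    authenticated: bool   # configured API key or active OAuth session

def select_route(candidates: Iterable[Candidate], *, tier: int, context_floor: int,
                 provider_preference=("anthropic",),
                 cooldowns=frozenset()) -> Optional[Candidate]:
    """Filter to viable candidates, order by preference, first match wins."""
    viable = [
        c for c in candidates
        if c.authenticated                       # authenticated providers only
        and c.tier >= tier                       # capability tier satisfied
        and c.context_ceiling >= context_floor   # context floor met
        and c.provider not in cooldowns          # skip cooled-down providers
    ]
    # Order by operator preference; unlisted providers sort last (stable sort
    # preserves the incoming order within each provider).
    rank = {p: i for i, p in enumerate(provider_preference)}
    viable.sort(key=lambda c: rank.get(c.provider, len(provider_preference)))
    return viable[0] if viable else None
```

With the default preference, an Anthropic candidate wins over an equally viable alternative; putting its provider on cooldown fails over to the next viable match.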
Downgrade Classification
When a model switch would change the context class, the harness classifies the transition:
| Classification | Action | Example |
|---|---|---|
| Compatible | Auto-reroute (silent) | Legion → Clan (1-class drop, within floor) |
| Compatible with Compaction | Auto-compact if safe | Floor bridgeable, no pin crossed |
| Degrading | Operator confirmation required | Legion → Squad (3-class drop) |
| Ineligible | Excluded from candidates | Tier mismatch, thinking constraint |
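The table can be read as an ordered decision procedure. The sketch below assumes the rows are checked top to bottom and that "compatible" means a drop of at most one class with the floor already met; the enum values (with an unnamed intermediate class between Squad and Clan, implied by the 3-class Legion → Squad drop) and the rule ordering are assumptions, not confirmed behavior.

```python
from enum import IntEnum

class ContextClass(IntEnum):
    # Ordered context classes. Value 1 is reserved for the intermediate
    # class implied by the 3-class Legion -> Squad drop, not named here.
    SQUAD = 0
    CLAN = 2
    LEGION = 3

def classify_transition(old: ContextClass, new: ContextClass, *,
                        floor_met: bool, floor_bridgeable: bool,
                        pin_crossed: bool, eligible: bool = True) -> str:
    """Classify a model switch that changes the context class."""
    if not eligible:
        return "ineligible"                    # tier mismatch, thinking constraint
    if old - new <= 1 and floor_met:
        return "compatible"                    # auto-reroute silently
    if floor_bridgeable and not pin_crossed:
        return "compatible_with_compaction"    # auto-compact if safe
    return "degrading"                         # operator confirmation required
```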
Failure Recovery
Upstream failures are classified into recovery actions:
- Retryable flake (5xx, timeout) → retry same model once
- Rate limit (429) → cooldown provider, failover to alternative
- Auth failure (401/403) → surface to operator
- Context overflow → handled by compaction system
- User abort → no recovery action (not a failure)
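The mapping above can be sketched as a single classifier. The failure-record shape (a dict with `kind` and `status` keys) and the action names are illustrative assumptions.

```python
from typing import Optional

def recovery_action(failure: dict) -> str:
    """Map an upstream failure to a recovery action (sketch)."""
    status: Optional[int] = failure.get("status")
    if failure.get("kind") == "user_abort":
        return "none"                    # not a failure; no recovery
    if failure.get("kind") == "context_overflow":
        return "compact"                 # handed to the compaction system
    if status == 429:
        return "cooldown_and_failover"   # rate limit: cool down, switch provider
    if status in (401, 403):
        return "surface_to_operator"     # auth failure: needs operator action
    if failure.get("kind") == "timeout" or (status is not None and 500 <= status < 600):
        return "retry_once"              # retryable flake: retry same model once
    return "surface_to_operator"         # unknown: let the operator decide
```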
Route Matrix
A reviewed snapshot of provider/model context ceilings is embedded in the binary at compile time. Runtime routing never trusts live provider claims. The matrix is updated through a scheduled refresh pipeline that detects drift, classifies changes, and requires human review for context decreases or route removals.
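The drift classification might look like the following sketch: live ceilings are compared against the embedded snapshot, decreases and removals are routed to human review, and other changes (assumed here to include increases and new routes) pass through automatically. The matrix keys and routing of additions are assumptions.

```python
def classify_drift(snapshot: dict, live: dict):
    """Compare the embedded route matrix against live provider claims.

    Returns (needs_review, auto): context decreases and route removals
    require human review; increases and additions are assumed auto-safe.
    Keys are (provider, model) tuples; values are context ceilings (tokens).
    """
    needs_review, auto = [], []
    for route, ceiling in snapshot.items():
        if route not in live:
            needs_review.append(("removed", route))
        elif live[route] < ceiling:
            needs_review.append(("context_decrease", route))
        elif live[route] > ceiling:
            auto.append(("context_increase", route))
    for route in live.keys() - snapshot.keys():
        auto.append(("added", route))
    return needs_review, auto
```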