Cascade Failover

The Single-Provider Risk

Relying on a single LLM provider creates availability risks and limits cost control:

Outages: Every provider experiences downtime
Rate limits: High-traffic applications hit quota limits
Regional issues: Some providers have region-specific problems
Cost optimization: Different providers may be cheaper for different use cases

Manual failover logic is error-prone and clutters application code.

Automatic Multi-Provider Failover

LLMCascade wraps multiple provider configurations and automatically fails over when errors occur:

Tries providers in priority order
Catches ProviderError exceptions and moves to the next provider
Returns the first successful response
Supports all LLM methods (get_response, get_json_response, get_structured_json_response)
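
The behavior listed above can be pictured with a small self-contained sketch. It does not use majordomo_llm: `ProviderError`, `FakeProvider`, and `cascade_get_response` are hypothetical stand-ins defined here to illustrate the try-in-order pattern, and the library's actual implementation may differ.

```python
import asyncio
from typing import List, Optional


class ProviderError(Exception):
    """Stand-in for the library's provider-level error (hypothetical definition)."""


class FakeProvider:
    """Hypothetical provider that either answers or raises ProviderError."""

    def __init__(self, name: str, healthy: bool) -> None:
        self.name = name
        self.healthy = healthy

    async def get_response(self, user_prompt: str) -> str:
        if not self.healthy:
            raise ProviderError(f"{self.name} unavailable")
        return f"[{self.name}] reply to: {user_prompt}"


async def cascade_get_response(providers: List[FakeProvider], user_prompt: str) -> str:
    """Try providers in priority order and return the first successful response."""
    last_error: Optional[ProviderError] = None
    for provider in providers:
        try:
            return await provider.get_response(user_prompt)
        except ProviderError as exc:
            last_error = exc  # remember the failure and move to the next provider
    if last_error is None:
        raise ValueError("no providers configured")
    raise last_error  # every provider failed: surface the last error


providers = [
    FakeProvider("primary", healthy=False),   # simulated outage
    FakeProvider("secondary", healthy=True),  # serves the request instead
]
result = asyncio.run(cascade_get_response(providers, "Summarize this document"))
print(result)
```

In this sketch the primary is marked unhealthy, so the response comes from the secondary without the caller having to handle the first failure.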

Basic Usage

import asyncio

from majordomo_llm import LLMCascade

# Providers are listed in priority order as (provider, model) pairs
cascade = LLMCascade([
    ("anthropic", "claude-sonnet-4-20250514"),  # Primary: preferred provider
    ("openai", "gpt-4o"),                        # Secondary: reliable fallback
    ("gemini", "gemini-2.5-flash"),              # Tertiary: cost-effective backup
])


async def main() -> None:
    # Automatically tries providers in order until one succeeds
    response = await cascade.get_response(
        user_prompt="Summarize this document",
        system_prompt="Be concise.",
    )
    print(response.content)


# get_response is a coroutine, so it must run inside an event loop
asyncio.run(main())

Failover Behavior

1. The request is sent to Anthropic
2. If Anthropic raises ProviderError (rate limit, outage, etc.), OpenAI is tried
3. If OpenAI also fails, Gemini is tried
4. If all providers fail, the last ProviderError is raised

Important: Only ProviderError triggers failover. Application-level errors (bad prompts, validation failures) are not retried.
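
The distinction between the two error classes can be shown with a self-contained sketch. Everything here is a hypothetical stand-in (`ProviderError`, `make_provider`, `cascade`), written only to illustrate the rule stated above, and it assumes at least one provider is configured.

```python
import asyncio


class ProviderError(Exception):
    """Stand-in for a provider-level failure (hypothetical definition)."""


calls = []  # records which providers were attempted, in order


def make_provider(name, error=None):
    """Build a hypothetical async provider that raises `error` or returns text."""
    async def call(prompt):
        calls.append(name)
        if error is not None:
            raise error
        return f"{name}: ok"
    return call


async def cascade(providers, prompt):
    """Fail over on ProviderError only; any other exception propagates."""
    last_error = None
    for call in providers:
        try:
            return await call(prompt)
        except ProviderError as exc:
            last_error = exc
    raise last_error  # assumes a non-empty provider list


# Case 1: every provider fails, so the last ProviderError is raised.
all_down = [
    make_provider("a", ProviderError("a: rate limited")),
    make_provider("b", ProviderError("b: outage")),
]
try:
    asyncio.run(cascade(all_down, "hi"))
except ProviderError as exc:
    final_message = str(exc)

# Case 2: an application-level error stops the cascade immediately.
calls.clear()
bad_input = [
    make_provider("a", ValueError("validation failed")),
    make_provider("b"),
]
try:
    asyncio.run(cascade(bad_input, "hi"))
except ValueError:
    attempted = list(calls)  # provider "b" is never tried
```

Retrying a malformed request on another provider would most likely fail the same way, which is one reason to limit failover to provider-level errors.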

Strategy Tips

Diversify vendors: Don't use multiple models from the same provider, since an outage or account-level rate limit usually affects all of that provider's models
Consider capabilities: Ensure fallback models support your use case (structured outputs, context length, etc.)
Monitor which provider serves requests: Log the provider used for capacity planning
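
As a rough illustration of the monitoring tip, the sketch below logs and tallies the provider behind each request using only the standard library. How the serving provider is read from a cascade response is not shown in this guide, so `record_provider` is a hypothetical helper that takes the provider and model names as plain arguments.

```python
import logging
from collections import Counter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("cascade.monitor")

served_by = Counter()  # provider name -> number of requests served


def record_provider(provider: str, model: str) -> None:
    """Log and tally the provider that served a request."""
    served_by[provider] += 1
    logger.info("request served by provider=%s model=%s", provider, model)


# Simulated traffic; in practice these values would come from your own
# response metadata or wrapper code (an assumption, not a library API).
record_provider("anthropic", "claude-sonnet-4-20250514")
record_provider("anthropic", "claude-sonnet-4-20250514")
record_provider("openai", "gpt-4o")

# Share of requests that were not served by the primary provider
fallback_share = 1 - served_by["anthropic"] / sum(served_by.values())
print(f"fallback share: {fallback_share:.0%}")
```

A rising fallback share is a signal that the primary provider's quota or reliability may need attention.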

# Each provider has built-in retries (3 attempts with exponential backoff)
# Cascade adds cross-provider resilience on top of per-provider retries
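
The layering described in these comments can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `call_with_retries`, `cascade`, and the delay values are hypothetical, and only the "3 attempts with exponential backoff" figure comes from the text above.

```python
import asyncio


class ProviderError(Exception):
    """Stand-in for a provider-level failure (hypothetical definition)."""


async def call_with_retries(call, attempts=3, base_delay=0.01):
    """Retry `call` up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return await call()
        except ProviderError:
            if attempt == attempts - 1:
                raise  # retries exhausted; let the cascade move on
            await asyncio.sleep(base_delay * 2 ** attempt)


async def cascade(calls):
    """Give each provider its own retries before failing over to the next."""
    last_error = None
    for call in calls:
        try:
            return await call_with_retries(call)
        except ProviderError as exc:
            last_error = exc
    raise last_error  # assumes a non-empty list of calls


attempt_log = []  # records every attempt, including retries


async def flaky_primary():
    attempt_log.append("primary")
    raise ProviderError("primary down")


async def healthy_secondary():
    attempt_log.append("secondary")
    return "ok from secondary"


result = asyncio.run(cascade([flaky_primary, healthy_secondary]))
print(result, attempt_log)
```

Here the primary is attempted three times before the secondary is tried once, so a brief transient error is absorbed by retries while a sustained outage triggers failover.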

Next Steps

See the Cascade Failover recipe for more configuration examples.