Failover vs Verified Failover: Why Switching Is Not Enough for LLM APIs
When an LLM API provider goes down, most reliability tools switch to a backup provider. That's failover. But here's the problem nobody talks about: the backup provider might return a broken response.
The Silent Failure Problem
Standard failover detects a provider outage and routes to the next one. But "outage" is just one failure mode. Consider these scenarios where failover succeeds but your application still breaks:
- Truncation: OpenAI returns 500 tokens instead of 2000. HTTP 200, but your user sees half a response.
- Schema drift: Anthropic returns `{"content": [...]}` but DeepSeek returns `{"text": "..."}`. Your parser breaks.
- Cost spike: You fail over from GPT-4o ($2.50/1M input tokens) to Claude Opus ($15/1M input tokens). The request works, but your bill 6x'd.
- Format inconsistency: JSON output requested, but the backup model returns markdown. Your downstream pipeline chokes.
In every case, failover "worked" — you got a response from the backup provider. But the response violated your contract.
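
Two of these failure modes, schema drift and truncation, can be caught with a few lines of defensive parsing. The sketch below is illustrative only: `extract_text` and `is_truncated` are hypothetical helper names, the two payload shapes are the ones quoted in the list above, and the stop fields are assumed to have been lifted to the top level of the payload (OpenAI-style responses normally carry `finish_reason` inside each entry of `choices`).

```python
from typing import Any


def extract_text(payload: dict[str, Any]) -> str:
    """Return the response text from either illustrative payload shape."""
    content = payload.get("content")
    if isinstance(content, list):
        # Block-list shape: {"content": [{"type": "text", "text": "..."}]}
        return "".join(
            block.get("text", "") for block in content if isinstance(block, dict)
        )
    text = payload.get("text")
    if isinstance(text, str):
        # Flat shape: {"text": "..."}
        return text
    # Fail loudly instead of letting an unknown shape reach the parser.
    raise ValueError(f"unrecognized response shape: {sorted(payload)}")


def is_truncated(payload: dict[str, Any]) -> bool:
    """True when the provider reports that it stopped at the token limit."""
    # OpenAI-style responses report finish_reason == "length";
    # Anthropic-style responses report stop_reason == "max_tokens".
    return (
        payload.get("finish_reason") == "length"
        or payload.get("stop_reason") == "max_tokens"
    )
```

Checking the stop field matters because a truncated response still arrives with HTTP 200, so the status code alone says nothing about completeness.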
What Is Verified Failover?
Verified failover adds a validation step between provider response and application delivery. Before the response reaches your code, it's checked against a 6-dimension contract:
| Dimension | What It Checks | Example Failure |
|---|---|---|
| Schema | Response structure matches expected format | Missing required field |
| Latency | Response time within acceptable bounds | P99 spike from 2s to 30s |
| Cost | Token usage within budget | 6x cost from provider switch |
| Format | Output format matches specification | JSON requested, markdown returned |
| Semantic | Response meaning is consistent | Completely different answer |
| Compliance | Content meets safety/policy requirements | PII leak in response |
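
To make the table concrete, here is a minimal sketch of what a contract check could look like in plain Python. It is not Correctover's implementation: `Contract`, `validate`, the thresholds, and the SSN-like regex are all hypothetical choices for illustration. It covers five of the six dimensions; the semantic check is left out because it usually needs a reference answer or a second model.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Contract:
    """Hypothetical contract; every default here is an illustrative assumption."""
    required_fields: tuple[str, ...] = ("summary",)
    max_latency_s: float = 5.0
    max_cost_usd: float = 0.05
    # Crude stand-in for a compliance rule: US SSN-like digit groups.
    forbidden_patterns: tuple[str, ...] = (r"\b\d{3}-\d{2}-\d{4}\b",)


def validate(raw_text: str, latency_s: float, cost_usd: float,
             contract: Contract) -> list[str]:
    """Return a list of violations; an empty list means the response passes."""
    violations: list[str] = []

    # Format: JSON was requested, so the output must parse as JSON.
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        data = None
        violations.append("format: output is not valid JSON")

    # Schema: the parsed object must contain every required field.
    if isinstance(data, dict):
        missing = [f for f in contract.required_fields if f not in data]
        if missing:
            violations.append(f"schema: missing fields {missing}")
    elif data is not None:
        violations.append("schema: top-level value is not an object")

    # Latency: measured response time must stay within bounds.
    if latency_s > contract.max_latency_s:
        violations.append(f"latency: {latency_s:.1f}s exceeds {contract.max_latency_s:.1f}s")

    # Cost: the per-request cost must stay within budget.
    if cost_usd > contract.max_cost_usd:
        violations.append(f"cost: ${cost_usd:.4f} exceeds ${contract.max_cost_usd:.4f}")

    # Compliance: reject output that matches any forbidden pattern.
    for pattern in contract.forbidden_patterns:
        if re.search(pattern, raw_text):
            violations.append(f"compliance: output matches {pattern!r}")

    return violations
```

Returning a list of violations rather than a single boolean lets the caller decide which failures justify a retry and which justify switching providers.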
Standard Failover vs Verified Failover
Standard Failover
✗ Accepts any response from the backup provider, regardless of quality
✗ No contract validation — silent failures pass through
✗ Cannot detect truncation, schema drift, or cost spikes
✗ Binary: provider up or down, no health nuance
Verified Failover (Correctover)
✓ Verifies response before accepting the switch
✓ 6-dimension contract validation on every response
✓ Catches truncation, schema drift, cost spikes, format mismatches
✓ Health scoring + drift detection for proactive switching
✓ 87 self-healing rules auto-remediate failures
✓ P50 validation overhead: 22µs (negligible vs 500ms-5s API latency)
Real-World Example
Imagine you're building a legal AI tool that uses multiple LLM providers. OpenAI goes down mid-request.
# Standard failover: switches to Anthropic, returns whatever comes back
result = failover_client.chat("Summarize this contract")
# Returns: truncated at 500 tokens (HTTP 200)
# Your user sees half the summary. No error. No retry.
# Verified failover: validates before returning
result = engine.run("Summarize this contract")
# Anthropic returns truncated response
# Correctover detects: schema violation (missing conclusion section)
# Auto-heals: retries with re-prompt or switches to DeepSeek
# Returns: complete, verified response
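
The control flow behind that scenario can be sketched without any particular library. Everything below is an assumption made for illustration: `verified_failover`, `AllProvidersFailed`, the stub providers, and the "Conclusion:" check are hypothetical names that stand in for real provider clients and a real contract.

```python
from typing import Callable

Provider = Callable[[str], str]          # takes a prompt, returns response text
Validator = Callable[[str], list[str]]   # returns a list of contract violations


class AllProvidersFailed(Exception):
    """Raised when no provider returns a response that passes validation."""


def verified_failover(prompt: str, providers: list[tuple[str, Provider]],
                      validate: Validator) -> tuple[str, str]:
    """Try providers in order; return the first response that passes validation."""
    failures: list[str] = []
    for name, call in providers:
        try:
            response = call(prompt)
        except Exception as exc:  # outage, timeout, rate limit, ...
            failures.append(f"{name}: error: {exc}")
            continue
        violations = validate(response)
        if not violations:
            return name, response
        # The provider answered, but the answer broke the contract: keep going.
        failures.append(f"{name}: {violations}")
    raise AllProvidersFailed("; ".join(failures))


# Stub providers that mirror the scenario above (no network calls).
def openai_down(prompt: str) -> str:
    raise ConnectionError("provider unavailable")


def anthropic_truncated(prompt: str) -> str:
    return "The contract covers payment terms and"


def deepseek_ok(prompt: str) -> str:
    return "The contract covers payment terms. Conclusion: low risk."


def requires_conclusion(text: str) -> list[str]:
    """Toy contract: the summary must contain a conclusion section."""
    return [] if "Conclusion:" in text else ["schema: missing conclusion section"]
```

Calling `verified_failover("Summarize this contract", [("openai", openai_down), ("anthropic", anthropic_truncated), ("deepseek", deepseek_ok)], requires_conclusion)` skips the outage, rejects the truncated answer, and returns the DeepSeek stub's response. A plain failover loop would have stopped at the truncated one.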
The Cost of Unverified Failover
Based on analysis of production LLM API traffic across multiple providers:
- 3-7% of failover responses are silently broken (truncated, wrong format, schema mismatch)
- Cost spikes of 2-6x are common when switching from budget to premium providers
- Mean time to detect a silent failure without verification: hours to days
- Mean time to detect with Correctover: <1ms (validation is synchronous)
Getting Started
pip install correctover
from correctover import CorrectoverEngine
engine = CorrectoverEngine(
    providers=["openai", "deepseek", "anthropic"],
    failover_level="L3",
    contract_validation=True
)
# Every response is now verified
result = engine.run("Your prompt here")
Also available for JavaScript:
npm install correctover
Stop trusting failover. Start verifying it.