Zeus provides error capturing and retry mechanisms at every stage of message processing, ensuring the reliability of streaming and continuity of the user experience.

Backend Error Handling

Error Classification

| Error Type | Detection Condition | SSE Event | User Prompt |
| --- | --- | --- | --- |
| Authentication error | Authentication failure | error (Unauthorized) | Redirect to login |
| Rate limiting | Rate limit exceeded | error (RateLimited) | Prompt to retry later |
| Input length exceeded | Range of input length / InvalidParameter | error | Suggest shortening input |
| Context window overflow | context length / token limit | error | Suggest starting a new session |
| General exception | All other Exceptions | error | Includes traceback details |
All errors are returned to the frontend as SSE events of type error, whose payload contains the error_code and details fields.
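The classification above can be sketched as a small function that maps an exception to an error_code and serializes it as an SSE error event. This is a minimal illustration, not Zeus's actual code: the exception classes, the function names, and the codes InputTooLong, ContextOverflow, and InternalError are assumptions (the source only names Unauthorized and RateLimited), and matching on message substrings is one possible reading of the Detection Condition column.

```python
import json
import traceback


class AuthenticationError(Exception):
    """Hypothetical exception raised when authentication fails."""


class RateLimitError(Exception):
    """Hypothetical exception raised when a rate limit is exceeded."""


def classify_error(exc: Exception) -> str:
    """Map an exception to an error_code, following the classification table."""
    if isinstance(exc, AuthenticationError):
        return "Unauthorized"
    if isinstance(exc, RateLimitError):
        return "RateLimited"
    message = str(exc)
    lowered = message.lower()
    if "Range of input length" in message or "InvalidParameter" in message:
        return "InputTooLong"  # assumed code, not from the source
    if "context length" in lowered or "token limit" in lowered:
        return "ContextOverflow"  # assumed code, not from the source
    return "InternalError"  # assumed code, not from the source


def to_sse_error(exc: Exception) -> str:
    """Serialize an exception as one SSE frame with event type "error"."""
    code = classify_error(exc)
    if code == "InternalError":
        # General exceptions carry traceback details.
        details = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    else:
        details = str(exc)
    payload = json.dumps({"error_code": code, "details": details})
    # json.dumps escapes newlines, so the payload fits on a single data line.
    return f"event: error\ndata: {payload}\n\n"
```

Calling to_sse_error(RateLimitError("slow down")) would produce a frame beginning with "event: error" whose data line carries the RateLimited code.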

Frontend Error Handling

HTTP Status Code Handling

| HTTP Status Code | Meaning | Handling |
| --- | --- | --- |
| 401 | Unauthorized | Redirect to login page |
| 403 | Insufficient credits | Toast prompt to top up |
| 503 | Backend not started | Connection error prompt |
| 504 | Request timeout | Timeout prompt |

Stream Error Handling

| Error Type | Trigger Condition | Handling |
| --- | --- | --- |
| AbortError | User manually cancels | Silently handled, no error displayed |
| Network disconnection | Connection interrupted | Network error prompt |
| Parse error | Malformed SSE data | Log the error, attempt recovery |
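To illustrate the parse-error row, here is a minimal sketch of a tolerant SSE parser. The real frontend runs in the browser, so Python is used here only to show the logic. It assumes that "attempt recovery" means logging the malformed event and continuing with the next one, and that each event's data is JSON. The function name parse_sse_events and the logger name are hypothetical.

```python
import json
import logging

logger = logging.getLogger("sse")  # hypothetical logger name


def parse_sse_events(lines):
    """Yield (event, data) pairs; log and skip events whose data is not valid JSON."""
    event, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates one event
            if data_lines:
                try:
                    yield event, json.loads("\n".join(data_lines))
                except json.JSONDecodeError as exc:
                    # Recovery here means: record the problem, keep the stream alive.
                    logger.warning("Dropping malformed SSE event %r: %s", event, exc)
            event, data_lines = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips at most one leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
```

A malformed event in the middle of a stream is dropped with a warning, and later events are still delivered.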

Retry Strategy

Automatic Retry

When a recoverable error is encountered, the system automatically retries using an exponential backoff strategy:
| Configuration | Value |
| --- | --- |
| Max retry attempts | 3 |
| Initial wait | 1s |
| Backoff multiplier | 2x |
| Max wait | 8s |
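As a worked example, the wait before each retry can be computed from these four values. The sketch below assumes that "max retry attempts" counts retries after the initial request; the function name backoff_delays is hypothetical.

```python
def backoff_delays(max_retries=3, initial=1.0, multiplier=2.0, max_wait=8.0):
    """Return the wait in seconds before each retry, capped at max_wait."""
    return [
        min(initial * multiplier ** attempt, max_wait)
        for attempt in range(max_retries)
    ]


print(backoff_delays())  # [1.0, 2.0, 4.0]
```

Under this reading the three retries wait 1s, 2s, and 4s, so the 8s cap would only take effect if the retry count were raised above 3.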

Non-Retryable Errors

The following error types do not trigger automatic retries:
  • Authentication error (401) — Requires user to re-login
  • Insufficient credits (403) — Requires user to top up
  • User-initiated cancel (AbortError) — User intent is clear
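The exclusion list can be combined with the backoff settings in a single retry wrapper. The following is a sketch under stated assumptions: RequestError, is_retryable, and call_with_retry are hypothetical names, and every error not on the list above is treated as recoverable, which the source does not spell out.

```python
import time


class RequestError(Exception):
    """Hypothetical error type carrying an HTTP status or an error name."""

    def __init__(self, message, status=None, name=None):
        super().__init__(message)
        self.status = status
        self.name = name


NON_RETRYABLE_STATUS = {401, 403}  # authentication error, insufficient credits


def is_retryable(err: RequestError) -> bool:
    """Return False for the three non-retryable error types."""
    return err.status not in NON_RETRYABLE_STATUS and err.name != "AbortError"


def call_with_retry(fn, max_retries=3, initial=1.0, multiplier=2.0,
                    max_wait=8.0, sleep=time.sleep):
    """Call fn, retrying retryable failures with exponential backoff."""
    attempt = 0
    while True:
        try:
            return fn()
        except RequestError as err:
            if not is_retryable(err) or attempt >= max_retries:
                raise  # surface the error to the caller's normal handling
            sleep(min(initial * multiplier ** attempt, max_wait))
            attempt += 1
```

Passing sleep as a parameter keeps the wrapper testable without real delays. A 401, a 403, or an AbortError is re-raised immediately so that the login redirect, top-up prompt, or silent cancel can take over.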

Fallback & Degradation

Event Persistence Fallback

When persistence fails, RealtimeEventSaver falls back through the following phases:
| Phase | Behavior |
| --- | --- |
| Normal | Batch write to PostgreSQL (3 events/batch, 100ms interval) |
| Retry | Exponential backoff retry, up to 3 attempts |
| Fallback | Write to LocalStorage as backup |
| Recovery | On next load, replay from LocalStorage back to server |
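The four phases can be sketched as follows. This is an illustrative stand-in, not the real RealtimeEventSaver: the class name, the write_batch callable, and the use of a plain list in place of browser LocalStorage are all assumptions. It also assumes 3 retries after the first write, assumes the retry waits follow the same 1s, 2s, 4s schedule as the automatic retry strategy, and leaves out the 3-event, 100ms batching.

```python
import time


class EventSaverSketch:
    """Illustrative stand-in for the documented phases (hypothetical class)."""

    def __init__(self, write_batch, local_store, max_retries=3, sleep=time.sleep):
        self.write_batch = write_batch  # callable persisting a list of events; may raise
        self.local_store = local_store  # list standing in for LocalStorage
        self.max_retries = max_retries
        self.sleep = sleep

    def save(self, batch):
        """Normal write, then backoff retries, then fallback to the local store."""
        for attempt in range(self.max_retries + 1):
            try:
                self.write_batch(batch)
                return True
            except Exception:
                if attempt < self.max_retries:
                    self.sleep(min(2 ** attempt, 8))  # assumed 1s, 2s, 4s schedule
        self.local_store.extend(batch)  # fallback: keep a local backup
        return False

    def recover(self):
        """On next load, replay backed-up events and clear the backup on success."""
        if not self.local_store:
            return 0
        pending = list(self.local_store)
        self.write_batch(pending)  # if this raises, the backup is left intact
        self.local_store.clear()
        return len(pending)
```

Clearing the backup only after a successful replay means a failed recovery can be attempted again on a later load.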

Timeout Protection

| Timeout | Default Value | Description |
| --- | --- | --- |
| Agent max execution time | 7200s (2h) | Maximum duration for a single FastAPI call |
| MCP server | 1800s (30min) | Connection timeout per MCP server |
| HITL tool approval | Configurable | Independently set per tool |
| LangGraph recursion limit | 999 | Maximum iteration count for the Agent loop |
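One common way to enforce a time limit like these in Python is asyncio.wait_for. The sketch below is an assumption about how such limits could be applied, not a description of Zeus's internals; the constant names and run_with_timeout are hypothetical.

```python
import asyncio

# Defaults taken from the table; the constant names are hypothetical.
AGENT_MAX_EXECUTION_SECONDS = 7200  # 2h
MCP_SERVER_TIMEOUT_SECONDS = 1800   # 30min
RECURSION_LIMIT = 999


async def run_with_timeout(coro, timeout):
    """Await coro; return (True, result), or (False, None) if the limit is hit."""
    try:
        return True, await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the awaited coroutine before raising.
        return False, None
```

An agent call would then be wrapped as run_with_timeout(agent_call(), AGENT_MAX_EXECUTION_SECONDS), where agent_call is a placeholder. The recursion limit is not a time limit: LangGraph reads it from the recursion_limit key of the config passed to a graph invocation, for example {"recursion_limit": RECURSION_LIMIT}.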