Zeus adopts a two-tier backend architecture: the Web API (Next.js) serves as the central gateway handling user management, database operations, and request forwarding for all clients; the AI Backend (FastAPI) focuses exclusively on AI Agent orchestration, tool execution, and node communication.

Overview

  • Web API is the single entry point for all clients — it handles auth, user data, session/message persistence, credits, tool configuration, and more
  • AI Backend receives forwarded agent requests from Web API and executes the Agent loop (LLM reasoning + tool calls)
  • Extension / Desktop nodes maintain a direct WebSocket connection to AI Backend’s Gateway for real-time tool execution (browser automation, desktop control)
  • Each user has an independent cloud workspace (Supabase Storage / S3) and session state (PostgreSQL Checkpointer)

Two-Tier Backend

Web API (Next.js) — Gateway & Data Layer

The Web API is a Next.js application that serves as the central API gateway for all clients. It owns the database and handles all non-AI concerns:
| Responsibility | Details |
| --- | --- |
| Authentication | Better Auth (OAuth, email/password, SSO), JWT issuance & verification |
| User Management | User profiles, avatars, settings, agent preferences |
| Database | PostgreSQL via Drizzle ORM: sessions, messages, tools, skills, knowledge bases, projects, credits, showcase |
| Storage | Supabase Storage / S3 (MinIO): workspace files, skill files, zeus config |
| Credit System | Usage metering: check & deduct credits before forwarding agent calls |
| Tool Configuration | CRUD for user tool settings (MCP servers, OAuth tokens, etc.) |
| Skill Management | Skill CRUD, file upload/download, resource content prefetching |
| Knowledge Base | KB metadata CRUD, document upload, chunk preview |
| Session Management | Create/list/delete sessions, message persistence, event tracking |
| Agent Forwarding | Assemble context (LLM config, tools, skills, env vars) and forward to AI Backend via HTTP SSE |
Key API routes (93 total):
| Route Group | Examples | Description |
| --- | --- | --- |
| /api/auth/* | login, register, JWT, SSO, device-code | Authentication |
| /api/sessions/* | CRUD, messages, events, checkpoints | Session & message management |
| /api/tools/* | CRUD, MCP validation | Tool configuration |
| /api/skills/* | CRUD, files, toggle, import | Skill management |
| /api/knowledge-base/* | CRUD, documents, search, upload | Knowledge base |
| /api/projects/* | CRUD, resources, sessions | Project management |
| /api/credits/* | balance, transactions | Credit system |
| /api/agent/invoke | POST → forward to AI Backend | Agent invocation (forwarding) |
| /api/agent/resume | POST → forward to AI Backend | HITL resume (forwarding) |
| /api/config/* | LLM, embedding, sandbox | User configuration |
| /api/memory/* | CRUD, search, profile | Memory management |
| /api/scheduled-tasks/* | CRUD, callbacks | Scheduled task management |
| /api/deploy/* | deploy, restart-dev | Deployment |
| /api/storage/* | files, URLs | File storage |

AI Backend (FastAPI) — Agent Execution Engine

The AI Backend is a FastAPI service dedicated to AI Agent orchestration. It receives forwarded requests from the Web API and handles all AI-related operations. The Agent Runtime is the core of the AI Backend — the other modules provide supporting infrastructure around it:
AI Backend (FastAPI)
├── Agent Runtime (Core)
│   ├── AgentService / BaseService — invoke/resume entry, context init
│   ├── DeepAgents Framework — LangGraph-based Agent loop
│   ├── Tool System — Built-in, MCP, OAuth, Connector (4 layers)
│   ├── Prompt Assembly — CORE / SOUL / TOOLS / WORKFLOW / MEMORY + dynamic injection
│   ├── Session & Checkpointer — state persistence, isolation, HITL recovery
│   └── HITL — human approval before sensitive tool execution
├── WebSocket Gateway — real-time communication with Extension / Desktop nodes
├── RAG Service — knowledge base retrieval (vector + BM25 hybrid search)
├── Sandbox — code execution environments (E2B / OpenSandbox / Daytona)
├── Scheduler — scheduled task execution (TaskIQ + RabbitMQ)
└── Channels — Feishu and other channel integrations
| Responsibility | Details |
| --- | --- |
| Agent Runtime | Core Agent loop: LLM reasoning, tool planning, multi-step execution, prompt assembly, session management |
| WebSocket Gateway | Real-time bidirectional communication with Extension/Desktop nodes |
| RAG Service | Knowledge base retrieval (vector + BM25 hybrid search) |
| Sandbox | Code execution environment management (E2B / OpenSandbox / Daytona) |
| Scheduler | Task scheduling & async execution (via TaskIQ + RabbitMQ) |
| Channels | Feishu bot and other channel integrations |
AI Backend API routes:
| Route | Description | Protocol |
| --- | --- | --- |
| /api/agent/invoke | Agent conversation invocation | HTTP POST → SSE Stream |
| /api/agent/resume | HITL resume execution | HTTP POST → SSE Stream |
| /api/knowledge-base/* | Knowledge base RAG operations | HTTP REST |
| /api/node/* | Node device queries | HTTP REST |
| /api/scheduled-task/* | Scheduled task management | HTTP REST |
| /api/deploy/* | Coding project deployment | HTTP REST |
| /api/sandbox/* | Sandbox management | HTTP REST |
| /ws/extension | Browser extension WebSocket | WebSocket |
| /ws/desktop | Desktop application WebSocket | WebSocket |
| /ws/web | Web client WebSocket | WebSocket |

Components & Data Flow

Request Flow — Agent Invocation

WebSocket Gateway

The Gateway manages all WebSocket connections through ConnectionManager. It serves three client types (Extension, Desktop, and Web) and routes tool calls to nodes:
  • Extension / Desktop nodes: Each node is uniquely identified by node_id; a user can have multiple nodes
  • Web clients: Managed by user_id; the same user can have multiple Web connections
  • Tool calls: The Agent initiates JSON-RPC requests via call_tool(); the Gateway routes requests to the corresponding node and awaits responses (Future-based)
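The Future-based request/response pairing described in the last bullet can be sketched as follows. This is a minimal illustration, not the actual ConnectionManager: the class name `GatewaySketch`, the `send` callback, and the `on_message` hook are hypothetical, and the 60-second default mirrors the tool call timeout stated under System Invariants.

```python
import asyncio
import json
import uuid


class GatewaySketch:
    """Hypothetical stand-in for the Gateway's Future-based request tracking."""

    def __init__(self):
        self._pending = {}  # request id -> Future awaiting the node's reply

    async def call_tool(self, send, name, arguments, session_id, timeout=60.0):
        """Send a JSON-RPC tools/call request and wait for the matching reply."""
        request_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        request = {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments, "session_id": session_id},
        }
        try:
            await send(json.dumps(request))
            return await asyncio.wait_for(future, timeout)
        finally:
            # Always forget the request, whether it succeeded, failed, or timed out.
            self._pending.pop(request_id, None)

    def on_message(self, raw):
        """Resolve the pending Future whose id matches an incoming reply."""
        message = json.loads(raw)
        future = self._pending.get(message.get("id"))
        if future is None or future.done():
            return
        if "error" in message:
            future.set_exception(RuntimeError(message["error"]["message"]))
        else:
            future.set_result(message["result"])


async def demo():
    gateway = GatewaySketch()

    async def fake_node_send(raw):
        # Simulated node: answers on the next event-loop iteration.
        request = json.loads(raw)
        reply = {
            "jsonrpc": "2.0",
            "id": request["id"],
            "result": {"content": [{"type": "text", "text": "ok"}], "isError": False},
        }
        asyncio.get_running_loop().call_soon(gateway.on_message, json.dumps(reply))

    return await gateway.call_tool(
        fake_node_send, "browser_click", {"selector": "#submit-btn"}, "session_abc123"
    )
```

Keying pending Futures by the JSON-RPC `id` lets many tool calls share one WebSocket connection, since replies can arrive in any order.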

Agent Service

The Agent Service is the core orchestration layer, built on the DeepAgents framework (a higher-level wrapper over LangGraph).

Service Layer:
| Service | File | Responsibility |
| --- | --- | --- |
| BaseService | services/base.py | Context initialization, tool assembly, prompt construction, SSE event streaming |
| AgentService | services/agent.py | Agent mode invoke/resume entry point |
| RAGService | services/rag.py | Knowledge base retrieval (vector + BM25 hybrid search) |
| DocumentService | services/document.py | Document processing and chunking |
| FeishuService | services/feishu.py | Feishu channel integration |
| SchedulerService | services/scheduler.py | Scheduled task scheduling |

Tool System

Zeus tools are organized into four layers:
| Layer | Source | Registration | Execution Location |
| --- | --- | --- | --- |
| Built-in | utils/tools/built_in/ | Registered directly in code | AI Backend local |
| MCP | Passed from frontend config | langchain_mcp_adapters | MCP Server (remote) |
| OAuth | Passed from frontend config | Dynamically built LangChain Tool | AI Backend → OAuth API |
| Connector | Reported by WebSocket nodes | Bound via SessionManager | Remote nodes (Extension/Desktop) |
Connector Tools call chain: Agent → ToolRouter → Gateway → WebSocket → Node → Execute → Return via same path.
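The first hop of that chain, resolving a session to its bound node before anything is sent over the WebSocket, might look like the sketch below. The class and method names are hypothetical, and falling back to the first online node when the preferred node is unavailable is an assumption rather than documented behavior.

```python
class ToolRouterSketch:
    """Hypothetical session-to-node routing for Connector tools."""

    def __init__(self):
        self._bindings = {}  # session_id -> node_id

    def bind(self, session_id, online_node_ids, preferred_node_id=None):
        """Bind a session to a node, honoring the preferred node when it is online."""
        if not online_node_ids:
            raise LookupError("no online nodes available for this user")
        if preferred_node_id in online_node_ids:
            node_id = preferred_node_id
        else:
            node_id = online_node_ids[0]  # assumed fallback policy
        self._bindings[session_id] = node_id
        return node_id

    def route(self, session_id):
        """Return the node that should execute tool calls for this session."""
        try:
            return self._bindings[session_id]
        except KeyError:
            raise LookupError(f"session {session_id!r} is not bound to a node") from None
```

Once `route` returns a node id, the Gateway would forward the JSON-RPC request to that node's connection and the result would travel back along the same path.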

Node Management

Node management is handled by three cooperating components:
| Component | Description |
| --- | --- |
| NodeManager | Node registration/deregistration, heartbeat TTL (60s), periodic cleanup (30s) |
| SessionManager | Binds sessions to specific nodes, supports preferred_node_id specification |
| ToolRouter | Routes tool calls to the appropriate node based on session binding |
Each user can have up to 10 nodes; nodes that miss heartbeats are automatically marked offline and deregistered.
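A minimal sketch of the heartbeat TTL and per-user limit follows. The 60-second TTL and the limit of 10 nodes come from the text above; the class name, method names, and the injectable clock are illustrative assumptions. In the real service `cleanup` would presumably run every 30 seconds.

```python
import time

HEARTBEAT_TTL_SECONDS = 60  # from the NodeManager description
MAX_NODES_PER_USER = 10     # from the per-user node limit


class NodeRegistrySketch:
    """Hypothetical node registry with heartbeat expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._nodes = {}  # node_id -> (user_id, last_seen)

    def register(self, user_id, node_id):
        owned = [n for n, (u, _) in self._nodes.items() if u == user_id and n != node_id]
        if len(owned) >= MAX_NODES_PER_USER:
            raise RuntimeError("node limit reached for this user")
        self._nodes[node_id] = (user_id, self._clock())

    def heartbeat(self, node_id):
        """Refresh the last-seen timestamp of a registered node."""
        user_id, _ = self._nodes[node_id]
        self._nodes[node_id] = (user_id, self._clock())

    def cleanup(self):
        """Deregister nodes whose last heartbeat is older than the TTL."""
        now = self._clock()
        expired = [
            node_id
            for node_id, (_, last_seen) in self._nodes.items()
            if now - last_seen > HEARTBEAT_TTL_SECONDS
        ]
        for node_id in expired:
            del self._nodes[node_id]
        return expired
```

Injecting the clock keeps the expiry logic testable without sleeping for a full minute.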

Storage & State

| Storage | Technology | Owner | Purpose |
| --- | --- | --- | --- |
| App Database | PostgreSQL (Drizzle ORM) | Web API | Users, sessions, messages, tools, skills, credits, projects, showcase |
| Checkpointer | PostgreSQL (PostgresSaver) | AI Backend | Session-level Agent state persistence, supports HITL recovery |
| Memory | PostgreSQL + pgvector | AI Backend | Long-term memory (vectors + metadata), three-tier scoping |
| Knowledge Base | PostgreSQL + pgvector + BM25 | AI Backend | RAG document storage and hybrid retrieval |
| Workspace | Supabase Storage / S3 (MinIO) | Web API | User files (outputs, uploads, sandbox results). Auto-selects S3 when S3_ENDPOINT is configured |
| Cache | Redis | Both | Workspace cache (5min), memory cache (10min), TaskIQ result backend |
| Message Queue | RabbitMQ (TaskIQ) | AI Backend | Async task broker for scheduled tasks, RAG processing, cloud sync |

Communication Protocols

HTTP SSE (Agent Response Stream)

Agent invocations return text/event-stream. SSE event types:
| Event Type | data Field | Description |
| --- | --- | --- |
| text | {type, content} | Streaming text tokens |
| tool_call | {type, tool_name, tool_args, tool_call_id} | Agent initiates a tool call |
| tool_call_result | {type, tool_name, result, ...} | Tool execution result |
| interrupt | {type, tool_calls, ...} | HITL interrupt, awaiting user approval |
| complete | {type, finish_reason} | Stream ended |
| error | {type, error} | Error message |
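A client consuming this stream could dispatch on the `type` field of each payload, as in the sketch below. It assumes every event arrives as a single `data:` line carrying a JSON object; the exact wire framing is not specified here, so treat that as an assumption. The function name is hypothetical.

```python
import json


def collect_text(sse_lines):
    """Join streamed text tokens until a complete event; raise on an error event."""
    chunks = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and any other SSE fields
        payload = json.loads(line[len("data:"):].strip())
        kind = payload["type"]
        if kind == "text":
            chunks.append(payload["content"])
        elif kind == "error":
            raise RuntimeError(payload["error"])
        elif kind == "complete":
            break
        # tool_call, tool_call_result, and interrupt are ignored in this sketch
    return "".join(chunks)
```

A fuller client would also surface `tool_call` and `interrupt` events to the UI, the latter to prompt the user for HITL approval.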

WebSocket JSON-RPC 2.0 (Node Tool Calls)

Node tool calls follow the MCP (Model Context Protocol) specification.
Request:
{
  "jsonrpc": "2.0",
  "id": "uuid-string",
  "method": "tools/call",
  "params": {
    "name": "browser_click",
    "arguments": { "selector": "#submit-btn" },
    "session_id": "session_abc123"
  }
}
Response (Success):
{
  "jsonrpc": "2.0",
  "id": "uuid-string",
  "result": {
    "content": [
      { "type": "text", "text": "Clicked element successfully" },
      { "type": "image", "data": "base64...", "mimeType": "image/jpeg" }
    ],
    "isError": false
  }
}
Response (Error):
{
  "jsonrpc": "2.0",
  "id": "uuid-string",
  "error": {
    "code": -32603,
    "message": "Element not found"
  }
}
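On the receiving side, the two response shapes above can be handled with a small helper such as the following sketch. `ToolCallError` and `extract_text` are hypothetical names introduced for illustration; only the message fields shown in the examples are relied on.

```python
class ToolCallError(Exception):
    """Raised when a node answers with a JSON-RPC error object."""

    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code


def extract_text(response):
    """Return (joined text parts, isError flag) from a tools/call response."""
    if "error" in response:
        error = response["error"]
        raise ToolCallError(error["code"], error["message"])
    result = response["result"]
    text = "\n".join(
        part["text"] for part in result["content"] if part["type"] == "text"
    )
    return text, bool(result.get("isError", False))
```

Image parts are skipped here; a caller that needs screenshots would read the `data` and `mimeType` fields of parts whose `type` is `image`.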

WebSocket Message Types (Non JSON-RPC)

| Direction | type | Description |
| --- | --- | --- |
| Node → Gateway | register | Node registration (capabilities, tools) |
| Gateway → Node | registered | Registration confirmation |
| Node → Gateway | heartbeat | Heartbeat report (status, current_tasks) |
| Gateway → Node | heartbeat_ack | Heartbeat acknowledgment |
| Node ↔ Gateway | ping / pong | Keepalive |
| Web → Gateway | get_workflows | Request workflow list (forwarded to Extension) |
| Web → Gateway | execute_workflow | Execute workflow |
| Node → Gateway | task_complete | Workflow execution completed |
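The request/acknowledgment pairs in this table suggest a simple dispatch on the `type` field. The sketch below covers only the three pairs whose replies are listed; the function name and the `node_id` field echoed in the `registered` reply are assumptions, since the full message schemas are not given here.

```python
def handle_node_message(message):
    """Return the Gateway's reply to a non JSON-RPC node message, or None."""
    kind = message.get("type")
    if kind == "register":
        # Echoing node_id back is an assumed detail of the confirmation.
        return {"type": "registered", "node_id": message.get("node_id")}
    if kind == "heartbeat":
        return {"type": "heartbeat_ack"}
    if kind == "ping":
        return {"type": "pong"}
    return None  # other message types have no direct reply in this sketch
```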

Startup & Lifecycle

Web API Startup

The Next.js application starts automatically and serves:
  • All API routes under /api/*
  • Web frontend pages under /[locale]/*
  • Authentication via Better Auth middleware

AI Backend Startup

Key Environment Variables

Web API (Next.js):
| Variable | Description |
| --- | --- |
| DATABASE_URL | PostgreSQL connection string (app database) |
| BACKEND_URL | AI Backend URL for agent forwarding (default: http://localhost:8000) |
| BETTER_AUTH_SECRET | Better Auth secret key |
| SUPABASE_URL / SUPABASE_SERVICE_KEY | Supabase Storage configuration |
| S3_ENDPOINT / S3_ACCESS_KEY / S3_SECRET_KEY | S3/MinIO storage (alternative to Supabase, optional) |
AI Backend (FastAPI):
| Variable | Description |
| --- | --- |
| DATABASE_URL | PostgreSQL connection string (Checkpointer + pgvector) |
| NEXTJS_API_URL | Web API URL for callbacks (user data, config lookups) |
| OPENAI_API_KEY / OPENAI_BASE_URL | Default LLM configuration |
| REDIS_URL | Redis cache and TaskIQ result backend (optional) |
| RABBITMQ_URL | RabbitMQ message queue for async tasks (optional) |
| LANGCHAIN_API_KEY | LangSmith tracing (optional) |
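One way to load the AI Backend variables is sketched below. It treats the three variables marked optional as optional and every other variable as required; that split, along with the names `BackendSettings` and `load_settings`, is an assumption for illustration and not the project's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("DATABASE_URL", "NEXTJS_API_URL", "OPENAI_API_KEY", "OPENAI_BASE_URL")


@dataclass(frozen=True)
class BackendSettings:
    database_url: str
    nextjs_api_url: str
    openai_api_key: str
    openai_base_url: str
    redis_url: Optional[str] = None
    rabbitmq_url: Optional[str] = None
    langchain_api_key: Optional[str] = None


def load_settings(env: Optional[Mapping[str, str]] = None) -> BackendSettings:
    """Build settings from a mapping (defaults to os.environ), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return BackendSettings(
        database_url=env["DATABASE_URL"],
        nextjs_api_url=env["NEXTJS_API_URL"],
        openai_api_key=env["OPENAI_API_KEY"],
        openai_base_url=env["OPENAI_BASE_URL"],
        redis_url=env.get("REDIS_URL") or None,
        rabbitmq_url=env.get("RABBITMQ_URL") or None,
        langchain_api_key=env.get("LANGCHAIN_API_KEY") or None,
    )
```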

Health Checks

  • AI Backend: GET /health → {"status": "ok"}
  • AI Backend: GET / → {"name": "Zeus Backend API", "version": "1.0.0", "status": "running"}

System Invariants

  • JWT Authentication: All Web API routes require a valid JWT; the AI Backend receives the token forwarded by the Web API
  • Session Isolation: Each session_id has independent Checkpointer state; different sessions do not interfere
  • Node Heartbeat: Nodes that miss heartbeats for over 60 seconds are automatically marked offline; the Gateway immediately deregisters nodes on disconnect
  • Tool Call Timeout: WebSocket tool calls default to 60-second timeout; workflow execution has a 300-second timeout
  • SSE Non-Replay: Agent invocation SSE streams are one-time; after disconnect, context must be restored via Checkpointer
  • Credit Gate: Web API checks and deducts credits before forwarding any agent request to AI Backend
  • Single-Instance Gateway: The current ConnectionManager is a per-process singleton; WebSocket connections are not shared across processes
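The Credit Gate invariant (check and deduct before forwarding) can be illustrated with the sketch below. The Web API itself is a Next.js application, so this Python version shows only the ordering of the steps; the in-memory `balances` mapping, the `cost` parameter, and the function names are hypothetical.

```python
class InsufficientCredits(Exception):
    """Raised when a user cannot cover the cost of an agent call."""


def invoke_with_credit_gate(balances, user_id, cost, forward):
    """Check and deduct credits, then forward the request to the AI Backend."""
    balance = balances.get(user_id, 0)
    if balance < cost:
        # Reject before any request reaches the AI Backend.
        raise InsufficientCredits(f"user {user_id} has {balance}, needs {cost}")
    balances[user_id] = balance - cost
    return forward()
```

In a real deployment the check and the deduction would need to happen atomically in the database so that concurrent requests cannot overspend the same balance.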

Directory Structure

Web API (Next.js):
apps/web/src/
├── app/
│   ├── [locale]/                    # Page routes (i18n)
│   └── api/                         # API routes (93 routes)
│       ├── agent/                   # Agent forwarding → AI Backend
│       ├── auth/                    # Authentication
│       ├── sessions/                # Session management
│       ├── tools/                   # Tool configuration
│       ├── skills/                  # Skill management
│       ├── knowledge-base/          # Knowledge base
│       ├── projects/                # Project management
│       ├── credits/                 # Credit system
│       ├── memory/                  # Memory management
│       ├── config/                  # LLM / Embedding / Sandbox config
│       └── ...
├── db/
│   ├── schema/                      # Drizzle ORM table definitions
│   ├── model/                       # Data access layer (CRUD)
│   ├── service/                     # Business logic layer
│   └── storage/                     # Storage operations
├── lib/auth/                        # Authentication (Better Auth, JWT)
└── ...
AI Backend (FastAPI):
apps/ai-backend/src/
├── api/                             # FastAPI routing layer
│   ├── main.py                      # Entry point & lifecycle
│   ├── gateway.py                   # WebSocket gateway
│   ├── agent.py                     # Agent API (invoke/resume)
│   ├── mcp_gateway.py               # MCP protocol gateway
│   ├── skill.py                     # Skill API
│   ├── tools.py                     # Tools API
│   ├── knowledge_base.py            # Knowledge base RAG API
│   ├── node.py                      # Node query API
│   ├── scheduled_task.py            # Scheduled task API
│   ├── deploy.py                    # Deployment API
│   ├── sandbox.py                   # Sandbox API
│   └── channels/                    # Channels (Feishu)
├── services/                        # Business logic layer
│   ├── base.py                      # BaseService (Agent core)
│   ├── agent.py                     # AgentService
│   ├── rag.py                       # RAG retrieval
│   ├── document.py                  # Document processing
│   ├── sandbox.py                   # Sandbox management
│   ├── notification.py              # Notification service
│   ├── feishu.py                    # Feishu service
│   └── scheduler.py                 # Scheduled tasks
├── repository/                      # Data models & prompts
│   ├── models/                      # Pydantic Models
│   ├── prompts/                     # System Prompts (.md)
│   └── skills/                      # Skill definitions
└── utils/                           # Utility layer
    ├── core/                        # LLM, Memory, Skills, HITL
    ├── infra/                       # Backend, Checkpoint, Node, Redis, Auth
    ├── tools/                       # Tool implementations
    │   ├── built_in/                # Built-in tools
    │   └── plugin/                  # Plugin tools (Feishu)
    ├── sandbox/                     # Sandbox providers (E2B, OpenSandbox, Daytona)
    ├── knowledge_base/              # Vector storage & chunking
    └── channels/                    # Channel integrations
