Zeus uses Server-Sent Events (SSE) to stream the Agent’s reasoning process and tool calls to the frontend in real time. This page describes the complete streaming pipeline, event types, and chunking behavior.

Streaming Pipeline

The streaming pipeline uses end-to-end SSE pass-through: the Python backend generates events → the Next.js API proxies and forwards them → the frontend StreamProcessor consumes them.
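The proxy hop can be sketched as follows. This is an illustrative sketch only: the route path, backend URL, and helper name are assumptions, not the actual implementation.

```typescript
// Sketch of the Next.js proxy hop. The backend URL and helper name are
// assumptions for illustration, not the actual code.
const BACKEND_STREAM_URL = "http://localhost:8000/agent/stream"; // assumed address

// Wraps an upstream SSE body in a Response with streaming-friendly
// headers, so events are forwarded to the browser without buffering.
function sseProxyResponse(upstreamBody: ReadableStream<Uint8Array> | null): Response {
  return new Response(upstreamBody, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}

// Hypothetical App Router handler (e.g. app/api/agent/route.ts):
export async function POST(req: Request): Promise<Response> {
  const upstream = await fetch(BACKEND_STREAM_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: await req.text(),
  });
  return sseProxyResponse(upstream.body);
}
```

Because the upstream body stream is handed to the `Response` directly, events reach the frontend as soon as the backend emits them.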

Event Types

Core Events

| SSE Event | Trigger | Key Fields |
| --- | --- | --- |
| `text` | Each token output by the LLM | `content`, `role` |
| `tool_call` | LLM decides to invoke a tool | `tool_name`, `parameters`, `requires_approval` |
| `tool_call_result` | Tool execution completes | `tool_name`, `result`, `is_error` |
| `complete` | Agent execution finishes | `content`, `summary` |
| `error` | An exception occurs | `error`, `error_code`, `details` |
| `token_usage` | After an LLM call ends | `prompt_tokens`, `completion_tokens` |
Sandbox tool execution results (sandbox_exec_py, sandbox_exec_sh, etc.) are returned via the standard tool_call_result event and do not use a separate event type.
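On the wire, each of these events arrives as a standard SSE frame (`event:` line plus a `data:` line carrying JSON). A minimal decoder might look like this, assuming one `event:`/`data:` pair per frame with single-line JSON data (the common case for these events):

```typescript
// Minimal SSE frame decoder. Field names in the payload mirror the
// table above; the frame layout is the standard SSE wire format.
interface SSEEvent {
  event: string;
  data: Record<string, unknown>;
}

function parseSSEFrame(frame: string): SSEEvent {
  let event = "message"; // SSE default when no event: line is present
  let data = "";
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice("event:".length).trim();
    else if (line.startsWith("data:")) data += line.slice("data:".length).trim();
  }
  return { event, data: JSON.parse(data || "{}") as Record<string, unknown> };
}
```

For example, `parseSSEFrame('event: text\ndata: {"content":"Hi","role":"assistant"}')` yields the event name `text` with its parsed payload.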

Event Mapping

The runtime maps DeepAgents framework internal events to SSE messages.

Tool Call ID Queue

To correctly match on_tool_end events with their corresponding tool calls, the runtime maintains a FIFO queue (grouped by tool name, storing tool_call_id values). IDs are enqueued during on_chat_model_end and dequeued for matching during on_tool_end.
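The matching discipline can be sketched as follows. Note the real queue lives in the Python runtime; this TypeScript version only illustrates the per-tool-name FIFO behavior.

```typescript
// Illustrative sketch of the tool-call-ID matching queue. IDs are
// enqueued when the model finishes emitting tool calls and dequeued
// when the corresponding tool finishes executing.
class ToolCallIdQueue {
  private queues = new Map<string, string[]>();

  // on_chat_model_end: record the IDs of the tool calls just emitted.
  enqueue(toolName: string, toolCallId: string): void {
    const q = this.queues.get(toolName) ?? [];
    q.push(toolCallId);
    this.queues.set(toolName, q);
  }

  // on_tool_end: the oldest pending call for this tool name matches first.
  dequeue(toolName: string): string | undefined {
    return this.queues.get(toolName)?.shift();
  }
}
```

Grouping by tool name keeps matching correct even when the model issues several calls to the same tool in one turn: results are matched in the order the calls were issued.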

Frontend StreamProcessor

The frontend handleStreamMessage() consumes the SSE stream and routes each event to the appropriate state management store.
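The routing can be sketched as a switch over the event name. The store method names here (`appendText`, `upsertToolCall`, and so on) are illustrative assumptions, not the actual store API:

```typescript
// Hedged sketch of the routing inside handleStreamMessage().
interface StreamEvent {
  event: string;
  data: any;
}

interface StoreActions {
  appendText(content: string): void;
  upsertToolCall(data: any): void;
  setToolResult(data: any): void;
  finish(data: any): void;
  fail(data: any): void;
}

function routeStreamEvent(msg: StreamEvent, store: StoreActions): void {
  switch (msg.event) {
    case "text":
      store.appendText(msg.data.content);
      break;
    case "tool_call_chunk": // streamed fragment of tool arguments
    case "tool_call":       // complete parameters
      store.upsertToolCall(msg.data);
      break;
    case "tool_call_result":
      store.setToolResult(msg.data);
      break;
    case "complete":
      store.finish(msg.data);
      break;
    case "error":
      store.fail(msg.data);
      break;
  }
}
```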

Tool Call Streaming

Tool calls support two-phase streaming: tool_call_chunk events stream the arguments in real time, then a tool_call event delivers the complete parameters.

Expected Event Order (Frontend)

tool_call_chunk  (first chunk — creates streaming preview card)
tool_call_chunk  (subsequent chunks — appends argument text)
tool_call_chunk  (continues appending...)
tool_call        (complete parameters — replaces streaming card with final version)
tool_call_result (tool execution result)
| SSE Event | Trigger | Key Fields |
| --- | --- | --- |
| `tool_call_chunk` | LLM streams tool arguments | `tool_call_id`, `tool_name`, `args_chunk`, `index` |
| `tool_call` | Arguments are complete | `tool_name`, `parameters` (complete dict) |
stream-processor.ts creates/updates a streaming preview card (with streamingArgs) on each tool_call_chunk, then replaces it with the final card carrying complete parameters when the tool_call event arrives.
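The two-phase update can be sketched as a pure reducer over these events. The event field names follow the table above; the `ToolCard` shape itself is an assumption for illustration, not the actual card model:

```typescript
// Sketch of the two-phase card update: chunks accumulate streamingArgs,
// then tool_call replaces the preview with final parameters.
interface ToolCard {
  toolCallId: string;
  toolName: string;
  streamingArgs: string;                // raw argument text from chunks
  parameters?: Record<string, unknown>; // set once tool_call arrives
  final: boolean;
}

function applyToolEvent(
  card: ToolCard | null,
  evt: { event: string; data: any },
): ToolCard {
  if (evt.event === "tool_call_chunk") {
    // Append this fragment to the streaming preview.
    return {
      toolCallId: evt.data.tool_call_id,
      toolName: evt.data.tool_name,
      streamingArgs: (card?.streamingArgs ?? "") + evt.data.args_chunk,
      final: false,
    };
  }
  // tool_call: complete parameters supersede the streamed preview text.
  // Per the expected event order, at least one chunk precedes this.
  return { ...(card as ToolCard), parameters: evt.data.parameters, final: true };
}
```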

ai-backend (LangChain)

LangChain events naturally match this order:
on_chat_model_stream → tool_call_chunks → emits tool_call_chunk (streaming fragments)
on_chat_model_end    → tool_calls ready → emits tool_call (complete parameters)
At on_chat_model_end, the LLM has finished generating all arguments, so tc.get("args", {}) returns the fully parsed dict.

Video Pipeline SSE Events

When project_type === "video", the backend emits additional SSE events to drive the Video Workspace UI in real time.
| SSE Event | Direction | Key Fields | Purpose |
| --- | --- | --- | --- |
| `video_storyboard_init` | Backend → Frontend | `storyboard_id`, `elements[]`, `segments[]`, `audio_items[]` | Push initial storyboard structure to workspace |
| `video_element_update` | Backend → Frontend | `storyboard_id`, `element_id`, `asset` | Update element with new asset (image) |
| `video_shot_progress` | Backend → Frontend | `storyboard_id`, `shot_id`, `progress` (0–100) | Real-time shot generation progress |
| `video_shot_complete` | Backend → Frontend | `storyboard_id`, `shot_id`, `asset` | Shot video generation complete |
| `video_audio_complete` | Backend → Frontend | `storyboard_id`, `audio_id`, `asset` | Audio generation complete |
These events are handled by stream-processor.ts which updates videoStore (Zustand), triggering reactive UI updates in the Storyboard and Timeline panels. The projectType is propagated from the frontend’s ChatInterface through the SSE request body to the AI backend’s agent.py context, which conditionally loads the VIDEO.md prompt layer.
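As a pure-function sketch of how the shot events could fold into store state (the `ShotState` shape is an assumption; the real handling updates a Zustand store inside stream-processor.ts):

```typescript
// Illustrative reducer for shot-related video events.
interface ShotState {
  progress: number; // 0-100
  asset?: string;   // set when generation completes
}

function applyVideoEvent(
  shots: Record<string, ShotState>,
  evt: { event: string; data: any },
): Record<string, ShotState> {
  const shotId = evt.data.shot_id;
  if (evt.event === "video_shot_progress") {
    return { ...shots, [shotId]: { ...shots[shotId], progress: evt.data.progress } };
  }
  if (evt.event === "video_shot_complete") {
    return { ...shots, [shotId]: { progress: 100, asset: evt.data.asset } };
  }
  return shots; // other video_* events are handled elsewhere
}
```

Returning a new object on each event is what lets subscribed panels (Storyboard, Timeline) re-render reactively on every progress update.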

Chunking & Batching

Message Batching

The frontend merges rapid successive text updates, flushing UI updates in batches at approximately 60fps to avoid excessive re-rendering.
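The batching idea can be sketched with a hypothetical `TextBatcher`: deltas are buffered cheaply on arrival and committed once per scheduled flush. The real code would pass something like `requestAnimationFrame` as the scheduler; `setTimeout` at ~16 ms approximates one 60fps frame here.

```typescript
// Sketch: buffer rapid text deltas, commit them in one state update
// per flush so the UI re-renders at most ~60 times per second.
class TextBatcher {
  private pending = "";
  private scheduled = false;

  constructor(
    private commit: (text: string) => void,
    private scheduleFlush: (cb: () => void) => void = (cb) => setTimeout(cb, 16),
  ) {}

  // Called for every incoming text delta; cheap string append only.
  push(delta: string): void {
    this.pending += delta;
    if (!this.scheduled) {
      this.scheduled = true;
      this.scheduleFlush(() => this.flush());
    }
  }

  // One commit (and thus one re-render) per flush, however many
  // deltas arrived since the last one.
  flush(): void {
    if (this.pending) this.commit(this.pending);
    this.pending = "";
    this.scheduled = false;
  }
}
```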

Virtual Scrolling

Long conversations use virtual scrolling, rendering only the message cards within the visible viewport to improve scroll performance.

Selective State Persistence

Only essential state is persisted (such as sessionId and messageIds); full message content is loaded on demand from the server.