Zeus’s knowledge base retrieval uses a dual-path recall strategy combining vector semantic search + BM25 keyword matching, with results merged via RRF (Reciprocal Rank Fusion) to balance semantic understanding and exact matching.

Dual-path Recall + RRF Fusion

Retrieval Flow

| Stage | Description |
| --- | --- |
| 1. Vector Retrieval | pgvector cosine similarity search, filtered by `user_id` + `knowledge_base_id` metadata; retrieves `top_k × 3` candidates |
| 2. BM25 Retrieval | jieba Chinese tokenization + BM25Okapi keyword matching; also retrieves `top_k × 3` candidates |
| 3. RRF Fusion | Reciprocal Rank Fusion merges both result sets. Formula: `score = vector_weight / (rank_vec + 60) + keyword_weight / (rank_bm25 + 60)`, where each rank is the document's position in that path's result list |
| 4. Sort & Truncate | Results sorted by fusion score in descending order, returning the top `top_k` |
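The fusion step above can be sketched in a few lines of plain Python. The function and variable names here are illustrative, not taken from the codebase; the constant 60 is the conventional RRF smoothing constant:

```python
def rrf_fuse(vector_ids, keyword_ids, vector_weight=0.6,
             keyword_weight=0.4, k=60, top_k=5):
    """Weighted Reciprocal Rank Fusion over two ranked ID lists (sketch)."""
    scores = {}
    # Each path contributes weight / (rank + k) for every document it returned
    for rank, doc_id in enumerate(vector_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + vector_weight / (rank + k)
    for rank, doc_id in enumerate(keyword_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + keyword_weight / (rank + k)
    # Sort by fused score, descending, and truncate to top_k
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

A document that appears high in both lists accumulates score from both terms, which is how the fusion balances semantic and exact matches.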

Parameter Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| `top_k` | 5 | Maximum number of results to return |
| `vector_weight` | 0.6 | Weight of vector search in RRF |
| `keyword_weight` | 0.4 | Weight of keyword search in RRF |
| `use_rerank` | true | When enabled, retrieves 3× candidates for fusion ranking; when disabled, returns vector results directly |
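The effect of the `use_rerank` switch on the candidate budget can be pictured with a small hypothetical helper (not the actual implementation):

```python
def plan_retrieval(top_k=5, use_rerank=True):
    """Sketch: which paths run and how many candidates each fetches."""
    if not use_rerank:
        # Rerank disabled: vector results are returned directly
        return {"paths": ["vector"], "fetch": top_k}
    # Rerank enabled: both paths fetch 3x candidates for RRF fusion
    return {"paths": ["vector", "bm25"], "fetch": top_k * 3}
```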

Fallback Strategy

When rank-bm25 or jieba is not installed, the system automatically falls back to vector-only search mode, using only pgvector for semantic retrieval, and logs: "⚠️ BM25 dependencies unavailable, using vector-only search mode".
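A dependency guard like this is typically an import-time check; the sketch below shows the pattern (module and flag names are assumptions, not the project's actual identifiers):

```python
import logging

logger = logging.getLogger(__name__)

try:
    import jieba                      # Chinese tokenizer (third-party)
    from rank_bm25 import BM25Okapi   # BM25 implementation (third-party)
    BM25_AVAILABLE = True
except ImportError:
    # Missing optional dependencies: degrade to vector-only retrieval
    BM25_AVAILABLE = False
    logger.warning("⚠️ BM25 dependencies unavailable, using vector-only search mode")
```

Retrieval code can then branch on the flag instead of failing at query time.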

BM25Store 3-tier Cache

The BM25 index uses a 3-tier cache architecture, supporting lazy loading and multi-instance deployments:

Cache Tiers

| Tier | Storage | Characteristics |
| --- | --- | --- |
| L1 | Process memory | Fastest; dict structure; lost on process restart |
| L2 | Redis | Shared across instances; TTL auto-expiry; serialized storage |
| L3 | PostgreSQL (`bm25_indexes` table) | Persistent; zlib compression + pickle serialization |
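The read path checks the tiers in order and promotes hits upward. Here is a minimal stdlib sketch of that behavior, with plain dicts standing in for the Redis and PostgreSQL clients (class and method names are illustrative):

```python
import pickle
import zlib

class TieredBM25Cache:
    """Sketch of the L1 -> L2 -> L3 read path with promotion on hit."""

    def __init__(self, redis, postgres):
        self._l1 = {}        # L1: process-local, fastest
        self._redis = redis  # L2: serialized, shared across instances (dict stand-in)
        self._pg = postgres  # L3: zlib-compressed + pickled, persistent (dict stand-in)

    def get(self, kb_id):
        if kb_id in self._l1:                       # L1 hit
            return self._l1[kb_id]
        blob = self._redis.get(kb_id)               # L2 hit: deserialize, promote to L1
        if blob is not None:
            index = pickle.loads(blob)
            self._l1[kb_id] = index
            return index
        blob = self._pg.get(kb_id)                  # L3 hit: decompress, promote to L1 and L2
        if blob is not None:
            index = pickle.loads(zlib.decompress(blob))
            self._redis[kb_id] = pickle.dumps(index)
            self._l1[kb_id] = index
            return index
        return None                                 # full miss: caller rebuilds lazily

    def put(self, kb_id, index):
        self._l1[kb_id] = index
        self._redis[kb_id] = pickle.dumps(index)
        self._pg[kb_id] = zlib.compress(pickle.dumps(index))

    def invalidate(self, kb_id):
        # Clear the knowledge base's index from all 3 tiers
        self._l1.pop(kb_id, None)
        self._redis.pop(kb_id, None)
        self._pg.pop(kb_id, None)
```

In a real deployment the L2 writes would also set a TTL, and L3 reads/writes would go through the `bm25_indexes` table shown below.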

Index Lifecycle

| Event | Behavior |
| --- | --- |
| First retrieval | Lazy loading: fetches all chunks from the knowledge base, builds the BM25 index, writes to all 3 cache tiers |
| Document added/deleted | `invalidate(kb_id)` clears the index for that knowledge base from all 3 cache tiers |
| Next retrieval | Automatically rebuilds the index |
| Redis TTL expiry | Recovered from PostgreSQL, or rebuilt from scratch |
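The lazy-load-then-rebuild cycle reduces to a simple "get or build" shape. A generic sketch, assuming a dict-like cache and caller-supplied fetch/build functions (none of these names come from the codebase):

```python
def get_or_build(cache, kb_id, fetch_chunks, build_index):
    """Return the cached BM25 index for kb_id, building it lazily on a miss."""
    index = cache.get(kb_id)
    if index is None:
        # First retrieval, or first retrieval after invalidate(kb_id):
        # fetch all chunks and rebuild the index, then cache it
        index = build_index(fetch_chunks(kb_id))
        cache[kb_id] = index
    return index
```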

Index Data Structure

```python
from dataclasses import dataclass
from typing import List

from rank_bm25 import BM25Okapi  # third-party BM25 implementation

@dataclass
class BM25IndexData:
    bm25: BM25Okapi           # BM25 index instance
    corpus: List[List[str]]   # Tokenized corpus (jieba-segmented)
    documents: List[str]      # Original document content
    doc_ids: List[str]        # Document ID list (for result mapping)
    created_at: float         # Creation timestamp
    doc_count: int            # Document count
```

PostgreSQL Persistence Table

```sql
CREATE TABLE IF NOT EXISTS bm25_indexes (
    kb_id VARCHAR(255) PRIMARY KEY,
    index_data BYTEA NOT NULL,        -- zlib compressed + pickle serialized
    doc_count INTEGER NOT NULL DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

Multi-Knowledge-Base Retrieval

When a user associates multiple knowledge bases in a conversation, the system performs dual-path retrieval independently on each knowledge base, then merges all results sorted by score. Each knowledge base maintains its own independent BM25 index, so the indexes never interfere with one another.
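The merge step can be sketched as follows, with `search_one` standing in for the per-knowledge-base dual-path retrieval (function name and hit shape are assumptions):

```python
def search_multiple(query, kb_ids, search_one, top_k=5):
    """Run retrieval independently per knowledge base, then merge by score."""
    merged = []
    for kb_id in kb_ids:
        # Each KB is searched against its own BM25 index and vector store
        merged.extend(search_one(query, kb_id))
    # Global sort across all knowledge bases, highest fused score first
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged[:top_k]
```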

RAGService

RAGService is the service-layer wrapper for retrieval, providing:

| Method | Description |
| --- | --- |
| `search(query, knowledge_base_id, user_id, top_k, score_threshold)` | Single-knowledge-base retrieval |
| `search_multiple_knowledge_bases(query, knowledge_base_ids, user_id, top_k)` | Multi-knowledge-base retrieval + merged ranking |
| `format_for_context(chunks, max_chars)` | Format as LLM context (with source citations) |
| `format_for_tool_output(chunks, max_chunks)` | Format as Agent tool output |
| `get_retriever(user_id, knowledge_base_id, k)` | Get a LangChain `VectorStoreRetriever` |
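As an illustration of what a context formatter like `format_for_context` might do, here is a minimal sketch with a character budget and numbered source citations. The chunk field names (`source`, `content`) are assumptions, not the service's actual schema:

```python
def format_for_context(chunks, max_chars=2000):
    """Join retrieved chunks into an LLM context string with source citations."""
    parts, used = [], 0
    for i, chunk in enumerate(chunks, start=1):
        entry = f"[{i}] (source: {chunk['source']})\n{chunk['content']}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the character budget
        parts.append(entry)
        used += len(entry)
    return "\n\n".join(parts)
```

Truncating by whole chunks (rather than mid-chunk) keeps each citation paired with its full text.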