Core Architecture
The system uses a dual-path recall strategy: vector semantic search is combined with BM25 keyword matching, and the two result sets are merged and re-ranked via Reciprocal Rank Fusion (RRF).

Technology Stack
| Component | Technology | Description |
|---|---|---|
| Vector Store | PostgreSQL pgvector (langchain-postgres PGVector) | Vector storage + metadata filtering with cosine similarity |
| Embedding | OpenAI text-embedding-3-small (default) | Supports user-configurable embedding models |
| Keyword Retrieval | BM25 (rank-bm25 + jieba Chinese tokenizer) | 3-tier cache (Memory → Redis → PostgreSQL), lazy-loaded |
| Document Parsing | LangChain Document Loaders + SmartChunker | Multi-format support + semantically-aware chunking |
| Metadata Storage | Next.js API (Drizzle ORM + PostgreSQL) | Knowledge base / document CRUD |
| Agent Framework | DeepAgents (LangGraph) | Agentic RAG tool integration |
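The RRF merge step itself is a small computation: each document's fused score is the sum of 1 / (k + rank) over every result list it appears in. A minimal sketch (function names and the common k = 60 constant are illustrative, not the project's actual code):

```python
def rrf_merge(ranked_lists, k=60):
    """Merge several ranked result lists with Reciprocal Rank Fusion.

    Each input list is ordered best-first; a document scores
    1 / (k + rank) per list it appears in, and scores are summed.
    """
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]   # from pgvector cosine search
bm25_hits = ["doc_b", "doc_d", "doc_a"]     # from BM25 keyword search
print(rrf_merge([vector_hits, bm25_hits]))
# → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Note that RRF only needs ranks, not raw scores, which is why it fuses cosine similarities and BM25 scores cleanly without normalization.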
Data Flow Overview
- Document Processing: multi-format upload, SmartChunker intelligent chunking, processing pipeline
- Vector Store: pgvector storage, user-level embedding configuration
- Retrieval Strategy: dual-path recall, RRF fusion, BM25 3-tier cache
- API Reference: knowledge base / document management APIs, Agent tools
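The BM25 3-tier cache mentioned above (Memory → Redis → PostgreSQL, lazy-loaded) can be sketched as a read-through lookup. This is a simplified illustration, not the project's implementation: the Redis and PostgreSQL tiers are stood in by plain dicts so the sketch is runnable, and `build_bm25_index` is a placeholder for constructing a `rank_bm25` index over jieba-tokenized documents.

```python
def build_bm25_index(docs):
    # Placeholder for: BM25Okapi([jieba.lcut(d) for d in docs])
    return {"corpus": list(docs)}

class TieredBM25Cache:
    """Read-through cache: in-process memory, then Redis, then rebuild
    from the documents stored in PostgreSQL (lazy load on first miss)."""

    def __init__(self, redis, postgres):
        self.memory = {}        # tier 1: per-process dict
        self.redis = redis      # tier 2: stand-in for a Redis client
        self.postgres = postgres  # tier 3: stand-in for a document table

    def get_index(self, kb_id):
        # Tier 1: fastest, no serialization cost
        if kb_id in self.memory:
            return self.memory[kb_id]
        # Tier 2: shared across processes, survives restarts
        if kb_id in self.redis:
            index = self.redis[kb_id]
        else:
            # Tier 3: rebuild from source documents, then backfill Redis
            index = build_bm25_index(self.postgres[kb_id])
            self.redis[kb_id] = index
        self.memory[kb_id] = index
        return index
```

The lazy-load design means an index is only built when a knowledge base is first queried, and warm tiers are backfilled on the way up so subsequent lookups stop at tier 1.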