LangGraph Checkpointer is used to persist Agent execution state, supporting HITL (Human-in-the-Loop) interrupt recovery and session persistence.
1. Overview
1.1 Role of Checkpointer
| Feature | Description |
|---|---|
| Session State Persistence | Saves message history, tool call stack, and current step during Agent execution |
| HITL Interrupt Recovery | Pauses execution and saves state when a tool requires human approval, then resumes after approval |
| History Replay | Navigate to any checkpoint to view or replay the execution process |
| Service Restart Recovery | Resume execution from the most recent checkpoint after a service restart |
1.2 Differences from Other Components
| Component | Stored Content | Granularity | Lifecycle |
|---|---|---|---|
| Checkpointer | Agent execution state (messages, tool call stack) | Per execution step | Session-level (thread_id) |
| Backend | File system (user artifacts) | File operations | User-level (user_id) |
| Memory | Long-term memory (user knowledge, preferences) | Concept-level | Permanent |
2. Solution Comparison
2.1 Storage Solution Options
| Solution | Latency | Durability | Cost | Use Case |
|---|---|---|---|---|
| MemorySaver | ~0.1ms | ❌ Lost on service restart | Free | Development environment |
| PostgresSaver | 5-20ms | ✅ Strong durability | Low | Recommended for production |
| RedisSaver | 1-5ms | ⚠️ Requires persistence config | Medium | High-frequency HITL scenarios |
| DrizzleCheckpointSaver (current) | 50-100ms | ✅ Durable | Low | Compatible with Next.js API |
2.2 Recommendation: PostgresSaver
Rationale:
- We already have a PostgreSQL database (Supabase/Drizzle), so no additional infrastructure is required
- Officially supported by LangGraph, stable and reliable
- Acceptable latency (5-20ms)
- Checkpoint history can be queried via SQL for easy debugging
3. Implementation Plan
3.1 Replace DrizzleCheckpointSaver with PostgresSaver
Use the official LangGraph AsyncPostgresSaver, connecting to the database via the DATABASE_URL environment variable. During initialization, the service calls setup() to create the required database tables.
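A minimal sketch of that initialization, assuming the `langgraph-checkpoint-postgres` package is installed. The helper names `resolve_database_url` and `open_checkpointer` are hypothetical and are not taken from base_service.py; only `AsyncPostgresSaver.from_conn_string()` and `setup()` come from the library.

```python
import os
from contextlib import asynccontextmanager


def resolve_database_url(env=os.environ) -> str:
    """Return the PostgreSQL connection string from DATABASE_URL, or raise."""
    url = env.get("DATABASE_URL", "").strip()
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    return url


@asynccontextmanager
async def open_checkpointer(conn_string: str):
    """Yield an AsyncPostgresSaver after its tables have been set up."""
    # Imported lazily so this module still loads where the package is absent.
    from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

    # from_conn_string() is an async context manager that owns the connection.
    async with AsyncPostgresSaver.from_conn_string(conn_string) as checkpointer:
        # setup() must be called explicitly; it creates the checkpoint tables.
        await checkpointer.setup()
        yield checkpointer
```

Under these assumptions, a caller would write `async with open_checkpointer(resolve_database_url()) as checkpointer:` and pass `checkpointer` to the graph's `compile(checkpointer=...)` call. Because the saver lives inside a context manager, a long-running service would likely hold it open for the lifetime of the process rather than per request.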
3.2 Database Table Structure
PostgresSaver automatically creates a checkpoints table with the following key fields:
- thread_id - Session identifier
- checkpoint_id - Checkpoint identifier
- parent_checkpoint_id - Parent checkpoint identifier
- checkpoint - Checkpoint data (JSONB)
- metadata - Metadata (JSONB)
- created_at - Creation timestamp
The primary key is (thread_id, checkpoint_id), with indexes on thread_id and created_at.
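To illustrate how parent_checkpoint_id links checkpoints into a history that can be replayed, here is a small sketch that rebuilds the chain for one session. `CheckpointRow` and `lineage` are hypothetical names used only for illustration; the rows are assumed to have already been loaded from the table.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CheckpointRow:
    """The three key columns needed to walk a checkpoint chain."""
    thread_id: str
    checkpoint_id: str
    parent_checkpoint_id: Optional[str]


def lineage(rows: Iterable[CheckpointRow], thread_id: str, checkpoint_id: str) -> List[str]:
    """Return checkpoint ids from the root down to checkpoint_id for one thread."""
    by_id = {r.checkpoint_id: r for r in rows if r.thread_id == thread_id}
    chain: List[str] = []
    seen = set()
    current: Optional[str] = checkpoint_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at checkpoint {current}")
        if current not in by_id:
            raise KeyError(f"unknown checkpoint {current}")
        seen.add(current)
        chain.append(current)
        # Follow the parent pointer; the root checkpoint has no parent.
        current = by_id[current].parent_checkpoint_id
    chain.reverse()
    return chain
```

Filtering by thread_id first mirrors the composite key: a checkpoint_id is only unique within its session.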
4. HITL Workflow
5. Migration Steps
5.1 Migrating from DrizzleCheckpointSaver to PostgresSaver
- Install dependencies - Install langgraph-checkpoint-postgres>=1.0.0
- Modify base_service.py - Replace DrizzleCheckpointSaver references with AsyncPostgresSaver
- Update get_checkpointer method - Initialize using AsyncPostgresSaver.from_conn_string()
- Run database migration - PostgresSaver will automatically create the required table structure
- Remove old code - Delete DrizzleCheckpointSaver and related Next.js APIs
6. Monitoring and Debugging
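Checkpoint inspection and cleanup can be sketched as two parameterized SQL statements with basic logging around them. This sketch runs against an in-memory SQLite table that mimics a subset of the columns from section 3.2, purely so it executes without a server; it is a stand-in, not the PostgresSaver schema. Against PostgreSQL the placeholders would typically be `%s` rather than `?`, and all function names here are hypothetical.

```python
import logging
import sqlite3

logger = logging.getLogger("checkpointer")

LIST_THREAD_CHECKPOINTS = (
    "SELECT checkpoint_id, parent_checkpoint_id, created_at "
    "FROM checkpoints WHERE thread_id = ? ORDER BY created_at"
)
DELETE_EXPIRED_CHECKPOINTS = "DELETE FROM checkpoints WHERE created_at < ?"


def demo_connection() -> sqlite3.Connection:
    """Create a throwaway table with sample rows (stand-in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE checkpoints ("
        " thread_id TEXT NOT NULL,"
        " checkpoint_id TEXT NOT NULL,"
        " parent_checkpoint_id TEXT,"
        " created_at TEXT NOT NULL,"
        " PRIMARY KEY (thread_id, checkpoint_id))"
    )
    conn.executemany(
        "INSERT INTO checkpoints VALUES (?, ?, ?, ?)",
        [
            ("t1", "c1", None, "2024-01-01T00:00:00"),
            ("t1", "c2", "c1", "2024-03-01T00:00:00"),
            ("t2", "x1", None, "2024-02-01T00:00:00"),
        ],
    )
    return conn


def list_checkpoints(conn: sqlite3.Connection, thread_id: str):
    """Return all checkpoints for one session, oldest first."""
    rows = conn.execute(LIST_THREAD_CHECKPOINTS, (thread_id,)).fetchall()
    logger.info("loaded %d checkpoints for thread %s", len(rows), thread_id)
    return rows


def delete_expired(conn: sqlite3.Connection, cutoff_iso: str) -> int:
    """Delete checkpoints created before cutoff_iso and return the row count."""
    cur = conn.execute(DELETE_EXPIRED_CHECKPOINTS, (cutoff_iso,))
    conn.commit()
    logger.info("deleted %d checkpoints older than %s", cur.rowcount, cutoff_iso)
    return cur.rowcount
```

A query for interrupted checkpoints is omitted because it depends on how the interrupt state is recorded in the metadata column, which this document does not specify.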
Checkpoint history can be queried via SQL, including viewing all checkpoints for a session, finding checkpoints in an interrupted state, and cleaning up expired checkpoint data. It is recommended to add logging to track checkpoint load, save, and interrupt operations.
7. Summary
| Item | Current State | Target State |
|---|---|---|
| Checkpointer | DrizzleCheckpointSaver (HTTP API) | PostgresSaver (direct connection) |
| Latency | 50-100ms | 5-20ms |
| Dependencies | Next.js API | No additional dependencies |
| Durability | ✅ | ✅ |
| HITL Support | ✅ | ✅ |