The LangGraph Checkpointer persists Agent execution state, which enables HITL (Human-in-the-Loop) interrupt recovery and session persistence.

1. Overview

1.1 Role of Checkpointer

| Feature | Description |
| --- | --- |
| Session State Persistence | Saves message history, tool call stack, and current step during Agent execution |
| HITL Interrupt Recovery | Pauses execution and saves state when a tool requires human approval, then resumes after approval |
| History Replay | Navigate to any checkpoint to view or replay the execution process |
| Service Restart Recovery | Resume execution from the most recent checkpoint after a service restart |

1.2 Differences from Other Components

| Component | Stored Content | Granularity | Lifecycle |
| --- | --- | --- | --- |
| Checkpointer | Agent execution state (messages, tool call stack) | Per execution step | Session-level (thread_id) |
| Backend | File system (user artifacts) | File operations | User-level (user_id) |
| Memory | Long-term memory (user knowledge, preferences) | Concept-level | Permanent |
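
The session-level scope in the table is selected per call: LangGraph checkpointers look up state by the thread_id entry under the configurable key of the run config. A minimal sketch of building that config, where the helper name thread_config and the session ID format are illustrative assumptions rather than part of this project:

```python
from typing import Any, Dict


def thread_config(session_id: str) -> Dict[str, Any]:
    """Build the run config that scopes checkpoints to one session.

    The helper name and the ID format are hypothetical; only the
    {"configurable": {"thread_id": ...}} shape comes from LangGraph.
    """
    if not session_id:
        raise ValueError("session_id must be a non-empty string")
    return {"configurable": {"thread_id": session_id}}


# Two sessions get independent checkpoint histories.
config_a = thread_config("session-a")
config_b = thread_config("session-b")
```

The resulting dict would be passed as the config argument when invoking a compiled graph, so each session reads and writes only its own checkpoints.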

2. Solution Comparison

2.1 Storage Solution Options

| Solution | Latency | Durability | Cost | Use Case |
| --- | --- | --- | --- | --- |
| MemorySaver | ~0.1ms | ❌ Lost on service restart | Free | Development environment |
| PostgresSaver | 5-20ms | ✅ Strong durability | Low | Recommended for production |
| RedisSaver | 1-5ms | ⚠️ Requires persistence config | Medium | High-frequency HITL scenarios |
| DrizzleCheckpointSaver (current) | 50-100ms | ✅ Durable | Low | Compatible with Next.js API |

2.2 Recommendation: PostgresSaver

Rationale:
  1. A PostgreSQL database (Supabase/Drizzle) is already in place, so no additional infrastructure is required
  2. Officially supported by LangGraph, stable and reliable
  3. Acceptable latency (5-20ms)
  4. Checkpoint history can be queried via SQL for easy debugging

3. Implementation Plan

3.1 Replace DrizzleCheckpointSaver with PostgresSaver

Use the official LangGraph AsyncPostgresSaver, connecting to the database via the DATABASE_URL environment variable. The saver does not create its tables implicitly: the service should call setup() once during initialization to create the required database tables.
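
A minimal sketch of this initialization, assuming the langgraph-checkpoint-postgres package is installed and DATABASE_URL holds a Postgres connection string. The helper names get_database_url and open_checkpointer are hypothetical; AsyncPostgresSaver.from_conn_string() and setup() are the library calls named in this document.

```python
import os
from contextlib import asynccontextmanager
from typing import AsyncIterator


def get_database_url() -> str:
    """Read the connection string from DATABASE_URL, failing fast if unset."""
    url = os.environ.get("DATABASE_URL", "").strip()
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    return url


@asynccontextmanager
async def open_checkpointer() -> AsyncIterator[object]:
    """Yield an initialized AsyncPostgresSaver (hypothetical helper name)."""
    # Imported lazily so this module loads even without the Postgres extras.
    from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

    async with AsyncPostgresSaver.from_conn_string(get_database_url()) as saver:
        # Creates the checkpoint tables if they do not exist yet.
        await saver.setup()
        yield saver
```

A caller would compile the graph inside the async with block, for example builder.compile(checkpointer=saver), where builder stands for the service's existing graph builder. Keeping the saver inside a context manager means the connection is closed when the service shuts down.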

3.2 Database Table Structure

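When setup() is called, PostgresSaver creates a checkpoints table, plus auxiliary tables for pending writes, large values, and migration tracking. Key fields of the checkpoints table:
  • thread_id - Session identifier
  • checkpoint_ns - Checkpoint namespace (empty for the root graph)
  • checkpoint_id - Checkpoint identifier
  • parent_checkpoint_id - Parent checkpoint identifier
  • checkpoint - Checkpoint data (JSONB)
  • metadata - Metadata (JSONB)
In recent library versions the primary key is the composite (thread_id, checkpoint_ns, checkpoint_id), with an index on thread_id. There is no dedicated created_at column: the creation time is recorded in the ts field of the checkpoint JSON. The exact schema varies by version, so confirm it against the installed package before writing queries.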
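
The fields above are enough to list a session's history and to walk the parent links between checkpoints. The sketch below is illustrative only: the function names are hypothetical, %s is the psycopg placeholder style, and ordering by checkpoint_id assumes IDs sort chronologically, which should be verified for the installed version.

```python
from typing import List, Optional, Sequence, Tuple


def list_checkpoints_query(thread_id: str, limit: int = 20) -> Tuple[str, tuple]:
    """Return a parameterised query for a session's most recent checkpoints."""
    sql = (
        "SELECT checkpoint_id, parent_checkpoint_id, metadata "
        "FROM checkpoints "
        "WHERE thread_id = %s "
        "ORDER BY checkpoint_id DESC "
        "LIMIT %s"
    )
    return sql, (thread_id, limit)


def lineage(rows: Sequence[Sequence[Optional[str]]], checkpoint_id: str) -> List[str]:
    """Follow parent_checkpoint_id links from one checkpoint back to the first.

    Each row starts with (checkpoint_id, parent_checkpoint_id), matching the
    column order of the query above.
    """
    parents = {row[0]: row[1] for row in rows}
    chain: List[str] = []
    current: Optional[str] = checkpoint_id
    # Stop at the root, at an unknown ID, or if a cycle is detected.
    while current is not None and current in parents and current not in chain:
        chain.append(current)
        current = parents[current]
    return chain
```

The query and its parameters would be handed to a database cursor's execute() call; the returned rows can then be passed to lineage() to reconstruct the path that led to a given checkpoint.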

4. HITL Workflow
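
The pause, save, and resume cycle described in section 1.1 can be modeled in a few lines. The sketch below is a conceptual toy built only on the standard library, not the LangGraph API: ToyCheckpointStore, run_agent, and the tool names are all hypothetical. In LangGraph itself the pause is expressed with interrupt and the continuation with a resume command, while the checkpointer plays the role of the store.

```python
import json
from typing import Dict, List, Optional

# Hypothetical set of tools that need human approval before running.
SENSITIVE_TOOLS = {"delete_file"}


class ToyCheckpointStore:
    """In-memory stand-in for a checkpointer: one saved state per thread_id."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, thread_id: str, state: dict) -> None:
        # Serialise to JSON to mimic writing to a database column.
        self._rows[thread_id] = json.dumps(state)

    def load(self, thread_id: str) -> Optional[dict]:
        raw = self._rows.get(thread_id)
        return json.loads(raw) if raw is not None else None


def run_agent(store: ToyCheckpointStore, thread_id: str,
              tool_calls: List[str], approval: Optional[bool] = None) -> dict:
    """Run tool calls in order, pausing before a sensitive tool until a decision arrives."""
    state = store.load(thread_id) or {"step": 0, "log": []}
    state["status"] = "running"
    while state["step"] < len(tool_calls):
        tool = tool_calls[state["step"]]
        if tool in SENSITIVE_TOOLS:
            if approval is None:
                # No decision yet: persist the state and hand control back.
                state["status"] = "interrupted"
                store.save(thread_id, state)
                return state
            outcome = "executed" if approval else "rejected"
            approval = None  # one decision covers exactly one pending call
        else:
            outcome = "executed"
        state["log"].append(f"{tool}: {outcome}")
        state["step"] += 1
        store.save(thread_id, state)  # checkpoint after every step
    state["status"] = "done"
    store.save(thread_id, state)
    return state


store = ToyCheckpointStore()
calls = ["read_file", "delete_file", "write_report"]
paused = run_agent(store, "thread-1", calls)          # stops before delete_file
finished = run_agent(store, "thread-1", calls, True)  # resumes from the saved step
```

Because the state is reloaded from the store by thread_id, the second call could run in a different process or after a restart, which is the property a durable checkpointer provides.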


5. Migration Steps

5.1 Migrating from DrizzleCheckpointSaver to PostgresSaver

  1. Install dependencies - Install langgraph-checkpoint-postgres>=1.0.0
  2. Modify base_service.py - Replace DrizzleCheckpointSaver references with AsyncPostgresSaver
  3. Update get_checkpointer method - Initialize using AsyncPostgresSaver.from_conn_string()
  4. Run database migration - Call setup() on the saver; PostgresSaver then creates the required table structure
  5. Remove old code - Delete DrizzleCheckpointSaver and related Next.js APIs
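
Before switching the service over, it can help to confirm that the dependency from step 1 is installed at the required version. A small pre-flight sketch using only the standard library; the function names are hypothetical and the version parser deliberately ignores pre-release suffixes:

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple

PACKAGE = "langgraph-checkpoint-postgres"
MINIMUM = (1, 0, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric release segment, padded to three parts."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as "1.0" so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def meets_minimum(installed: str, minimum: Tuple[int, ...] = MINIMUM) -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= minimum


def check_dependency() -> str:
    """Describe whether the checkpointer package satisfies the minimum version."""
    try:
        installed = version(PACKAGE)
    except PackageNotFoundError:
        return f"{PACKAGE} is not installed"
    if meets_minimum(installed):
        return f"{PACKAGE} {installed} satisfies the minimum"
    return f"{PACKAGE} {installed} is older than the required minimum"
```

Calling check_dependency() at service startup or in a deployment check gives an early, readable failure instead of an import error deep inside base_service.py.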

6. Monitoring and Debugging

Checkpoint history can be queried via SQL, including viewing all checkpoints for a session, finding checkpoints in an interrupted state, and cleaning up expired checkpoint data. It is recommended to add logging to track checkpoint load, save, and interrupt operations.
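
One way to add the recommended logging is a small context manager wrapped around each checkpoint operation. This is a sketch under assumptions: the logger name, the helper name log_checkpoint_op, and the message format are illustrative choices, not an existing convention in this codebase.

```python
import logging
import time
from contextlib import contextmanager
from typing import Iterator

logger = logging.getLogger("checkpointer")


@contextmanager
def log_checkpoint_op(operation: str, thread_id: str) -> Iterator[None]:
    """Log the outcome and duration of one checkpoint operation."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        # Record the failure with its traceback, then let the caller handle it.
        logger.exception("checkpoint %s failed thread_id=%s", operation, thread_id)
        raise
    else:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info(
            "checkpoint %s ok thread_id=%s elapsed_ms=%.1f",
            operation, thread_id, elapsed_ms,
        )
```

Wrapping load, save, and interrupt handling in this helper, for example with log_checkpoint_op("save", thread_id), yields one log line per operation, which also makes it easy to compare observed latency against the 5-20ms target.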

7. Summary

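| Item | Current State | Target State |
| --- | --- | --- |
| Checkpointer | DrizzleCheckpointSaver (HTTP API) | PostgresSaver (direct connection) |
| Latency | 50-100ms | 5-20ms |
| Dependencies | Next.js API | No additional dependencies |
| Durability | ✅ Durable | ✅ Strong durability |
| HITL Support | | |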