
thread_id too long for Postgres checkpoint

Open Freezaa9 opened this issue 3 months ago • 9 comments

Checked other resources

  • [x] This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
  • [x] I added a clear and detailed title that summarizes the issue.
  • [x] I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • [x] I included a self-contained, minimal example that demonstrates the issue INCLUDING all the relevant imports. The code run AS IS to reproduce the issue.

Example Code

from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import StateGraph

DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres?sslmode=disable"

too_long_thread_id = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    builder = StateGraph(...)
    graph = builder.compile(checkpointer=checkpointer)

    # Invoke inside the `with` block so the Postgres connection is still open.
    graph.invoke(
        {"messages": [{"role": "user", "content": "hi! i am Bob"}]},
        {"configurable": {"thread_id": too_long_thread_id}},
    )

Error Message and Stack Trace (if applicable)

Postgres error:
there is no unique or exclusion constraint matching the ON CONFLICT specification

Description

When a thread_id is extremely long, Postgres fails to index it. Once this happens, all calls can fail, even those that use a short thread_id.

System Info

python -m langchain_core.sys_info

Solution

Constrain the size of the thread_id?

Freezaa9 avatar Oct 06 '25 08:10 Freezaa9

Would hashing the thread_id be suitable? Simply truncating it could make two different thread ids collide.

rkarmaka98 avatar Oct 06 '25 10:10 rkarmaka98

It could fix the issue, but then the thread id stored in PostgreSQL would no longer match the one you passed in, which makes checkpointing harder to use and debug.

Freezaa9 avatar Oct 06 '25 14:10 Freezaa9

@Freezaa9 We can use a UUID derived deterministically from the phrase, so the same input always yields the same value. Can you explain how PostgreSQL uses this thread_id, or how it is generated?
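
A minimal sketch of the deterministic-UUID idea, assuming a name-based UUID (version 5) from the Python standard library is acceptable. The namespace `THREAD_NAMESPACE`, the domain `threads.example.com`, and the helper `stable_thread_id` are hypothetical names chosen for illustration; none of them come from LangGraph.

```python
import uuid

# Hypothetical application namespace (an assumption for this sketch).
# Any fixed UUID works, as long as it never changes between runs.
THREAD_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "threads.example.com")


def stable_thread_id(phrase: str) -> str:
    """Map an arbitrary-length phrase to a fixed-length, deterministic id."""
    # uuid5 hashes the namespace and the phrase with SHA-1, so the same
    # phrase always produces the same 36-character string.
    return str(uuid.uuid5(THREAD_NAMESPACE, phrase))


long_phrase = "X" * 5000
thread_id = stable_thread_id(long_phrase)

# The short id is what gets passed to the graph; the original phrase would
# have to be stored elsewhere if it is needed for debugging.
config = {"configurable": {"thread_id": thread_id}}
```

The trade-off raised earlier still applies: the value stored in PostgreSQL is the derived id, not the original phrase.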

rkarmaka98 avatar Oct 06 '25 14:10 rkarmaka98

Update regarding the error returned by langgraph:

ProgramLimitExceeded: Index row size 4664 exceeds btree version 4 maximum 2704 for index "checkpoint_blobs_pkey" ...

__exit__ (/usr/local/lib/python3.12/site-packages/psycopg/pipeline.py:265)

So psycopg does return the correct error.

Would it be good practice for LangGraph to raise its own error stating that thread_id should not be that long? Or should this depend solely on the database used for checkpointing?

Freezaa9 avatar Oct 06 '25 14:10 Freezaa9

We should add validation in LangGraph to prevent this issue. I'm thinking we could add a warning when thread_id or checkpoint_ns is too long (>500 characters?), so users can catch this before hitting the actual PostgreSQL error.

Additionally, we should update the documentation to recommend using UUID or hash for identifiers:

import uuid
thread_id = str(uuid.uuid4())  # Recommended approach
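
The proposed length warning could be sketched as follows. This is an illustration only, not LangGraph code: `MAX_ID_LENGTH`, `warn_if_id_too_long`, and `check_config` are hypothetical names, and the 500-character threshold is the tentative value floated in this comment.

```python
import warnings

# Tentative threshold from this thread; not a LangGraph constant.
MAX_ID_LENGTH = 500


def warn_if_id_too_long(name: str, value: str, limit: int = MAX_ID_LENGTH) -> bool:
    """Emit a UserWarning when an identifier is longer than `limit` characters."""
    if len(value) > limit:
        warnings.warn(
            f"{name} is {len(value)} characters long (limit {limit}); "
            "very long identifiers can exceed database index size limits. "
            "Consider using a UUID instead.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


def check_config(config: dict) -> None:
    """Check the identifiers in a config dict before invoking a graph."""
    configurable = config.get("configurable", {})
    for key in ("thread_id", "checkpoint_ns"):
        value = configurable.get(key)
        if isinstance(value, str):
            warn_if_id_too_long(key, value)
```

Calling `check_config(config)` before `graph.invoke(...)` would surface the problem early; whether this should be a warning or a hard error is the open question in this thread.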

SunHuawei avatar Oct 06 '25 15:10 SunHuawei

@SunHuawei Yes, I think this is the way to go. The actual Postgres limit is 2704 bytes per B-tree index row.

So I guess we can wait for the LangGraph team to validate this approach.
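
Because that limit is measured in bytes rather than characters, a check based on the encoded size may be closer to what Postgres enforces. The sketch below is a rough estimate under stated assumptions: `key_bytes` and `may_exceed_index_limit` are hypothetical helpers, the values passed in should be whichever columns the index actually covers (check the index definition), and per-row overhead and any compression Postgres applies are ignored.

```python
# Value taken from the error message quoted in this thread.
BTREE_MAX_INDEX_ROW_BYTES = 2704


def key_bytes(*parts: str) -> int:
    """Sum the UTF-8 encoded size of the values that make up an index key."""
    return sum(len(part.encode("utf-8")) for part in parts)


def may_exceed_index_limit(*parts: str) -> bool:
    """Rough check: True when the raw key bytes alone exceed the limit."""
    return key_bytes(*parts) > BTREE_MAX_INDEX_ROW_BYTES


# 2000 characters, but each "é" takes two bytes in UTF-8.
accented_id = "é" * 2000
size = key_bytes(accented_id)
too_big = may_exceed_index_limit(accented_id)
```

A character-count threshold such as 500 stays well below this limit even for multi-byte text, which may be why a conservative value was suggested.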

Freezaa9 avatar Oct 07 '25 08:10 Freezaa9

I'd prefer if we updated the docs to just recommend using UUIDs for conversation ids. That's what we expect people to do. The docs sometimes skip this just to keep code snippets short (i.e., they leave out the UUID generation part).

eyurtsev avatar Nov 07 '25 20:11 eyurtsev

> We should add validation in LangGraph to prevent this issue. I'm thinking we could add a warning when thread_id or checkpoint_ns is too long (>500 characters?),

I'd be on board with doing this if folks want to open a PR

eyurtsev avatar Nov 07 '25 20:11 eyurtsev

I'm a bit busy at the moment, but I'll try to make time for a PR soon. Thanks @eyurtsev

Freezaa9 avatar Nov 17 '25 12:11 Freezaa9