Times are messed up in Chainlit

Open asvishnyakov opened this issue 7 months ago • 9 comments

Describe the bug

https://github.com/Chainlit/chainlit/blob/0e863aee98c030b85d67e0d9ed8ad6f532967820/backend/chainlit/data/chainlit_data_layer.py#L374-L377

saves the timestamp in UTC (with a trailing Z), but

https://github.com/Chainlit/chainlit/blob/0e863aee98c030b85d67e0d9ed8ad6f532967820/backend/chainlit/data/chainlit_data_layer.py#L635

reads it back in local time (without the Z).

When the system then tries to save the value again, we get an exception:

2025-09-03 22:00:48 - Task exception was never retrieved
future: <Task finished name='Task-1150' coro=<ChainlitDataLayer.update_step() done, defined at /Users/dmitriy/Documents/GitHub/cookbook/resume-chat/.venv/lib/python3.13/site-packages/chainlit/data/utils.py:10> exception=ValueError("time data '2025-09-04T02:00:42.164000' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'")>
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/tasks.py", line 304, in __step_run_and_handle_result
    result = coro.send(None)
  File "/Users/dmitriy/Documents/GitHub/cookbook/resume-chat/.venv/lib/python3.13/site-packages/chainlit/data/utils.py", line 25, in wrapper
    return await method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dmitriy/Documents/GitHub/cookbook/resume-chat/.venv/lib/python3.13/site-packages/chainlit/data/chainlit_data_layer.py", line 388, in update_step
    await self.create_step(step_dict)
  File "/Users/dmitriy/Documents/GitHub/cookbook/resume-chat/.venv/lib/python3.13/site-packages/chainlit/data/utils.py", line 25, in wrapper
    return await method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dmitriy/Documents/GitHub/cookbook/resume-chat/.venv/lib/python3.13/site-packages/chainlit/data/chainlit_data_layer.py", line 368, in create_step
    timestamp = datetime.strptime(created_at, ISO_FORMAT)
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/_strptime.py", line 674, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
                                    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/_strptime.py", line 453, in _strptime
    raise ValueError("time data %r does not match format %r" %
                     (data_string, format))
ValueError: time data '2025-09-04T02:00:42.164000' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'

This happens when the value is returned to the data layer and the data layer tries to update the thread step.

To Reproduce: see #2486

Expected behavior: UTC should be both saved and returned.
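The parsing failure can be reproduced in isolation. The snippet below is a minimal sketch that assumes only the format string shown in the traceback; `accepts` is a hypothetical helper name, not part of Chainlit.

```python
from datetime import datetime

# Format string as it appears in the ValueError above.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def accepts(value: str) -> bool:
    """Return True if strptime can parse the value with ISO_FORMAT."""
    try:
        datetime.strptime(value, ISO_FORMAT)
    except ValueError:
        return False
    return True


with_z = "2025-09-04T02:00:42.164000Z"
without_z = "2025-09-04T02:00:42.164000"

print(accepts(with_z))
print(accepts(without_z))

# datetime.isoformat() on a naive datetime never emits a trailing "Z",
# so a value produced this way cannot round-trip through ISO_FORMAT.
naive_iso = datetime(2025, 9, 4, 2, 0, 42, 164000).isoformat()
print(naive_iso, accepts(naive_iso))
```

The literal `Z` in the format string is matched as a plain character, so any string lacking it is rejected regardless of which timezone the value was meant to represent.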

asvishnyakov avatar Sep 04 '25 03:09 asvishnyakov

This is a known issue: Chainlit's data layer returns datetime strings for created_at without the 'Z' suffix, which indicates UTC. This causes a ValueError when the code tries to parse the string using the format %Y-%m-%dT%H:%M:%S.%fZ, since Python's datetime.isoformat() does not append 'Z' by default. Both DynamoDB and SQLAlchemy return the correct UTC format with the 'Z' suffix, but Chainlit does not, leading to failures when updating thread steps. The expected behavior is that all timestamps should be formatted with the 'Z' suffix to indicate UTC, matching the ISO format used for parsing. This bug is tracked in issue #2491 and issue #2085. The fix is to ensure all timestamps returned by Chainlit's data layer are formatted with the 'Z' suffix.

dosubot[bot] avatar Sep 04 '25 03:09 dosubot[bot]

Oh wow.

@DmitriyAlergant thanks for dragging me down this rabbit hole.

So basically, timezone handling in Chainlit is completely messed up.

First, there’s the bug described above, which strips the Z from a proper UTC time and thereby turns it into a local time in steps (messages).

Then:

  • Messages, steps & callbacks use UTC
  • For threads, the Chainlit data layer uses the local server time.
  • And the cherry on top: the SQLAlchemy and DynamoDB data layers add a Z to the local server “now” time for threads, effectively creating dates and times in the future.
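To illustrate the last bullet, here is a minimal sketch of what happens when a Z is appended to a naive local "now". The function names and the UTC+2 server offset are assumptions made for the example; this is not the code of any of the data layers.

```python
from datetime import datetime, timedelta, timezone


def mislabeled_as_utc(local_naive: datetime) -> str:
    """Problematic pattern: a naive local time with 'Z' glued on."""
    return local_naive.isoformat() + "Z"


def correct_utc(aware: datetime) -> str:
    """Convert to UTC first, then format with the trailing 'Z'."""
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


# Hypothetical server running two hours ahead of UTC.
server_tz = timezone(timedelta(hours=2))
aware = datetime(2025, 9, 4, 8, 0, 0, tzinfo=server_tz)
naive_local = aware.replace(tzinfo=None)  # what a naive datetime.now() would hold

wrong = mislabeled_as_utc(naive_local)
right = correct_utc(aware)

# Interpret the mislabeled string as the UTC value it claims to be
# and compare it with the real instant.
claimed = datetime.fromisoformat(wrong.replace("Z", "+00:00"))
skew = claimed - aware
print(wrong, right, skew)
```

For a server whose clock is ahead of UTC the mislabeled value lands in the future by the size of the offset; for a server behind UTC the same pattern would shift it into the past.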

The "ideal" all-UTC implementation is in #2490. However, since there are no migrations, it would be a breaking change, and I don't know how to resolve that.

@hayescode @sandangel Thoughts?

asvishnyakov avatar Sep 04 '25 06:09 asvishnyakov

@asvishnyakov good find! Every time I go deep on something I find some pretty extensive tech debt :(

I'm working on #2469 to unify the base shared components used throughout plus the data layers. Things like created_at should be set in one place but as you point out they're set in many places. I'm thinking something like this:

https://github.com/Chainlit/chainlit/blob/feat%2Fsqlmodel/backend%2Fchainlit%2Fmodels%2Fuser.py#L41
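As an illustration of setting created_at in one place, here is a minimal stdlib sketch. It is not the code from the linked branch; `TimestampedRecord`, `utc_now`, and `created_at_iso` are hypothetical names, and a plain dataclass stands in for whatever model class the branch actually uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def utc_now() -> datetime:
    """Single source of truth for 'now': always timezone-aware UTC."""
    return datetime.now(timezone.utc)


@dataclass
class TimestampedRecord:  # hypothetical base for steps, threads, users, ...
    id: str
    # Every record gets its timestamp from the same factory.
    created_at: datetime = field(default_factory=utc_now)

    def created_at_iso(self) -> str:
        """Serialize in UTC with a trailing 'Z'."""
        return self.created_at.astimezone(timezone.utc).strftime(ISO_FORMAT)


record = TimestampedRecord(id="step-1")
print(record.created_at_iso())
```

Because the default factory and the serializer live on one shared base, individual data layers would no longer need to call `datetime.now()` or format timestamps themselves.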

I'm also introducing Alembic versioned migrations here, so we could migrate everyone automatically this way. My preference would be to wait for this, since it will also introduce a breaking change: I plan to delete the other SQL data layers, not add yet another one. DynamoDB will remain and be updated for the new components (and simplified, since pydantic will easily handle the serialization). So SQLModel and DynamoDB stay. We need to talk about LiteralAI too, because I'd like to delete that as well. I'm not sure how many people are hosting their own LiteralAI, but I'm guessing not many, since that's the reason they're bankrupt...

hayescode avatar Sep 04 '25 12:09 hayescode

@hayescode Yeah, I know the universal data layer with migrations is our holy cow - I’ll join as soon as I can. But what do we do before that? :)

Maybe @DmitriyAlergant is right, and it’s better to introduce a flexible parser for now and support this whole zoo. Or maybe it’s not such a big deal, and we should just replace everything with UTC now, even if it messes up users’ thread order for about a day.
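A flexible parser of the kind mentioned above could look roughly like this. It is a sketch under assumptions: `parse_timestamp_lenient` is a hypothetical name, and the choice to leave Z-less values naive is mine, made because the thread shows that such values mean different things in different data layers.

```python
from datetime import datetime, timezone


def parse_timestamp_lenient(value: str) -> datetime:
    """Parse an ISO-8601 timestamp with or without a trailing 'Z'.

    With 'Z': returns a timezone-aware UTC datetime.
    Without 'Z': returns a naive datetime and leaves interpretation
    (UTC vs. server local time) to the caller.
    """
    if value.endswith("Z"):
        return datetime.fromisoformat(value[:-1]).replace(tzinfo=timezone.utc)
    return datetime.fromisoformat(value)


with_z = parse_timestamp_lenient("2025-09-04T02:00:42.164000Z")
without_z = parse_timestamp_lenient("2025-09-04T02:00:42.164000")
print(with_z.tzinfo, without_z.tzinfo)
```

This avoids the ValueError from the strict format string, but it does not by itself resolve which timezone a Z-less value was written in.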

What options come to your mind?

asvishnyakov avatar Sep 04 '25 17:09 asvishnyakov

I'm very new to the project, so I should not really be offering opinions... From an outside perspective, I would consider bundling together a major SQL backend cleanup and untangling the timezone mess at the same time (as one big, somewhat-breaking change). This is where you can standardize on storing UTC in the database, maybe even apply time conversions during the migration, etc.

While applying this parsing crutch in the interim.
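A time conversion during migration might be sketched as follows. This assumes the stored value is a naive server-local timestamp and that the server's timezone is known; `local_naive_to_utc_z` is a hypothetical helper, and the fixed UTC-4 offset is only for illustration (a real migration would more likely need a proper zone, such as `zoneinfo.ZoneInfo`, to handle daylight saving time).

```python
from datetime import datetime, timedelta, timezone, tzinfo

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def local_naive_to_utc_z(value: str, server_tz: tzinfo) -> str:
    """Reinterpret a naive server-local timestamp and emit it as UTC with 'Z'."""
    naive = datetime.fromisoformat(value)
    aware = naive.replace(tzinfo=server_tz)  # attach the assumed server zone
    return aware.astimezone(timezone.utc).strftime(ISO_FORMAT)


# Hypothetical server four hours behind UTC.
server_tz = timezone(timedelta(hours=-4))
migrated = local_naive_to_utc_z("2025-09-03T22:00:42.164000", server_tz)
print(migrated)
```

The conversion is only as good as the assumption about which rows hold local time; rows that already hold UTC would have to be excluded or they would be shifted twice.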

DmitriyAlergant avatar Sep 04 '25 17:09 DmitriyAlergant

And sorry for dragging you into this rabbit hole:)

DmitriyAlergant avatar Sep 04 '25 17:09 DmitriyAlergant

@DmitriyAlergant That sounds great, but:

  1. We almost never have time to deliver multiple big features together (unless we count them as one big feature)
  2. The project changes frequently, and large PRs quickly accumulate conflicts with the current codebase

asvishnyakov avatar Sep 04 '25 17:09 asvishnyakov

So will the standard Postgres data layer be deleted as well?

kevinnkansah avatar Sep 07 '25 19:09 kevinnkansah

It does seem a bit tricky. I don’t think we can expect a single comprehensive solution to fix everything; we have to treat it as two separate issues. First, standardize all read/store formats. I’m inclined to go without the Z. This step is easy to implement and ensures new users have the correct behavior. Second, migrate historical data. There may not be a good migration plan right now, so we can keep the current behavior for now (even if it’s incorrect).

Proposed solution that guarantees legacy data behavior remains unchanged: use a new, distinguishable unified time format after the fix. When reading, if the old format is detected, use the old processing method.
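One way to read this proposal is sketched below. The marker used for the new format (an explicit `+00:00` offset) and the names `write_timestamp` and `read_timestamp` are assumptions made for illustration; the thread does not settle on a concrete format.

```python
from datetime import datetime, timezone

# Hypothetical marker: new-format values always carry an explicit UTC offset.
NEW_SUFFIX = "+00:00"


def write_timestamp(dt: datetime) -> str:
    """Write path after the fix: aware datetime in, UTC with explicit offset out."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).isoformat(timespec="microseconds")


def read_timestamp(value: str) -> tuple[datetime, bool]:
    """Return (datetime, is_legacy).

    New-format values come back as aware UTC datetimes. Anything else is
    treated as legacy and returned naive, so the caller can keep applying
    the old processing unchanged.
    """
    if value.endswith(NEW_SUFFIX):
        return datetime.fromisoformat(value), False
    legacy = value[:-1] if value.endswith("Z") else value
    return datetime.fromisoformat(legacy), True


stored = write_timestamp(datetime(2025, 9, 4, 2, 0, 42, 164000, tzinfo=timezone.utc))
new_dt, new_is_legacy = read_timestamp(stored)
old_dt, old_is_legacy = read_timestamp("2025-09-04T02:00:42.164000Z")
print(stored, new_is_legacy, old_is_legacy)
```

Since legacy strings in this thread end either with Z or with no suffix at all, the explicit offset is enough to tell the two generations apart without a schema change.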

slovx2 avatar Nov 06 '25 10:11 slovx2