pydantic icon indicating copy to clipboard operation
pydantic copied to clipboard

Time serialization can silently drop timezone

Open davidhewitt opened this issue 1 year ago • 6 comments

Initial Checks

  • [X] I confirm that I'm using Pydantic V2

Description

Originally reported at https://github.com/pydantic/speedate/issues/62

If a time has a tzinfo with a non-trivial offset with e.g. depends on dst, then we can erroneously drop it.

We should almost certainly not do this. I think our best options are either to serialise the timezone name (hard but powerful) or to error (correct but frustrating for users).

A third option would be to assume today and caclulate DST offset based on that, but I think that's also bad tbh.

Example Code

from pydantic import TypeAdapter
from datetime import time
from zoneinfo import ZoneInfo

TypeAdapter(time).dump_json(time(12, 4, 5, 123456, tzinfo=ZoneInfo("America/Fortaleza")))
#> b'"12:04:05.123456"'

Python, Pydantic & OS Version

pydantic version: 2.8.0a1
        pydantic-core version: 2.18.3
          pydantic-core build: profile=debug pgo=false
                 install path: /Users/david/Dev/pydantic/pydantic/pydantic
               python version: 3.12.3 (main, May  8 2024, 09:46:04) [Clang 15.0.0 (clang-1500.3.9.4)]
                     platform: macOS-14.4.1-arm64-arm-64bit
             related packages: mypy-1.10.0 typing_extensions-4.12.1 pyright-1.1.360
                       commit: c6e8bcf1

davidhewitt avatar Jun 05 '24 11:06 davidhewitt

Thanks @davidhewitt, I've added this to the v2.8.0 milestone!

sydney-runkle avatar Jun 05 '24 13:06 sydney-runkle

Initial Checks

  • [x] I confirm that I'm using Pydantic V2

Description

Originally reported at pydantic/speedate#62

If a time has a tzinfo with a non-trivial offset with e.g. depends on dst, then we can erroneously drop it.

We should almost certainly not do this. I think our best options are either to serialise the timezone name (hard but powerful) or to error (correct but frustrating for users).

A third option would be to assume today and caclulate DST offset based on that, but I think that's also bad tbh.

Example Code

from pydantic import TypeAdapter
from datetime import time
from zoneinfo import ZoneInfo

TypeAdapter(time).dump_json(time(12, 4, 5, 123456, tzinfo=ZoneInfo("America/Fortaleza")))
#> b'"12:04:05.123456"'

Python, Pydantic & OS Version

pydantic version: 2.8.0a1
        pydantic-core version: 2.18.3
          pydantic-core build: profile=debug pgo=false
                 install path: /Users/david/Dev/pydantic/pydantic/pydantic
               python version: 3.12.3 (main, May  8 2024, 09:46:04) [Clang 15.0.0 (clang-1500.3.9.4)]
                     platform: macOS-14.4.1-arm64-arm-64bit
             related packages: mypy-1.10.0 typing_extensions-4.12.1 pyright-1.1.360
                       commit: c6e8bcf1

@davidhewitt i have something little better then option 3 (even i am fine with erroring out or parsing if that will be consistent across time)

lets call option 4 instead using today in option 3 use Unix start of time as "today" i have done it here and it is preaty stable

however for some reason this may still error out for some timezones (probably tz didn't exist at midnight 1.1.1970)

and i humby ask it to be an requrement (but if imposible drop it) since calling same code with same data on different day may get different serialized result (aka tz switches to DST) and that is probably not wanted by most users

07pepa avatar Jun 05 '24 14:06 07pepa

and i humby ask it to be an requrement (but if imposible drop it) since calling same code with same data on different day may get different serialized result (aka tz switches to DST) and that is > probably not wanted by most users

I fully agree with this; I think we should almost certainly not have validation which behaves differently depending on the actual date! I listed option 3 just for completeness. I agree with your suggestion of using unix epoch if we were to pick a reference point to coerce variable timezones to fixed offsets. But I still think that coercion isn't great and it would be better to either fully support IANA timezones or just error when serializing and force users to think about what they want.

davidhewitt avatar Jun 05 '24 15:06 davidhewitt

Hello!

While migrating one of our FastAPI apps from pydanticv1 to pydanticv2, I also noticed that this behavior differs between the two pydantic versions.

With pydanticv1:

>>> pyenv virtualenv 3.9.5 pydanticv1
>>> pyenv activate pydanticv1
>>> pip install 'pydantic<2'
>>> ipython
In [1]: from datetime import datetime, timezone

In [2]: from pydantic import BaseModel

In [3]: class Foo(BaseModel):
   ...:     dt: datetime
   ...:

In [4]: foo = Foo(dt=datetime.now(tz=timezone.utc))

In [5]: foo.json()
Out[5]: '{"dt": "2024-06-17T15:32:31.825334+00:00"}'

With pydanticv2:

>>> pyenv virtualenv 3.9.5 pydanticv2
>>> pyenv activate pydanticv2
>>> pip install 'pydantic>=2'
>>> ipython
In [1]: from datetime import datetime, timezone

In [2]: from pydantic import BaseModel

In [3]: class Foo(BaseModel):
   ...:     dt: datetime
   ...:

In [4]: foo = Foo(dt=datetime.now(tz=timezone.utc))

In [5]: foo.json()  # or foo.model_dump_json()
Out[5]: '{"dt":"2024-06-17T15:33:09.300525Z"}'

Priyansh121096 avatar Jun 17 '24 15:06 Priyansh121096

@davidhewitt Concerning the second option and serializing the timezone name, do you have any idea of this output? Maybe the timezone name into brackets like this 12:04:05.123456 [America/Fortaleza] ?

JeanArhancet avatar Jun 23 '24 15:06 JeanArhancet

I think what we should do is support RFC 9557.

It looks like the correct thing to do is to serialize the UTC offset as it was at that instant, plus the timezone name if available:

1996-12-19T16:39:57-08:00[America/Los_Angeles]

davidhewitt avatar Jun 25 '24 09:06 davidhewitt

While that's in progress, here's a band-aid for anyone else having issues with Z suddenly appearing in your timestamps:

from datetime import datetime
from typing import Annotated

from pydantic import PlainSerializer

DateTime = Annotated[
    datetime, PlainSerializer(lambda dt: dt.isoformat())
]

And just use DateTime instead of datetime on any field that needs it.

MajorDallas avatar Jan 14 '25 22:01 MajorDallas

I've revisited this extensively today and am less sure this is a bug.

Python datetime behaves the same way:

>>> from datetime import time, timezone, timedelta
>>> from zoneinfo import ZoneInfo
>>> time(12, 4, 5, 123456, tzinfo=ZoneInfo("America/Fortaleza")).isoformat()
'12:04:05.123456'

According to datetime rules, this time object is "naive" despite having a non-none tzinfo: https://docs.python.org/3/library/datetime.html#determining-if-an-object-is-aware-or-naive

I think it's probably still unintended for users to hit this case, so an exception would still be useful but would diverge from Python and would be breaking, so I think it'd need to be an opt-in.

The full named (IANA) timezone support for datetime objects seems like an unambiguous improvement, but I think that's a separate feature request.

davidhewitt avatar Sep 15 '25 18:09 davidhewitt