Time serialization can silently drop timezone
Initial Checks
- [X] I confirm that I'm using Pydantic V2
Description
Originally reported at https://github.com/pydantic/speedate/issues/62
If a time has a tzinfo with a non-trivial offset with e.g. depends on dst, then we can erroneously drop it.
We should almost certainly not do this. I think our best options are either to serialise the timezone name (hard but powerful) or to error (correct but frustrating for users).
A third option would be to assume today and caclulate DST offset based on that, but I think that's also bad tbh.
Example Code
from pydantic import TypeAdapter
from datetime import time
from zoneinfo import ZoneInfo
TypeAdapter(time).dump_json(time(12, 4, 5, 123456, tzinfo=ZoneInfo("America/Fortaleza")))
#> b'"12:04:05.123456"'
Python, Pydantic & OS Version
pydantic version: 2.8.0a1
pydantic-core version: 2.18.3
pydantic-core build: profile=debug pgo=false
install path: /Users/david/Dev/pydantic/pydantic/pydantic
python version: 3.12.3 (main, May 8 2024, 09:46:04) [Clang 15.0.0 (clang-1500.3.9.4)]
platform: macOS-14.4.1-arm64-arm-64bit
related packages: mypy-1.10.0 typing_extensions-4.12.1 pyright-1.1.360
commit: c6e8bcf1
Thanks @davidhewitt, I've added this to the v2.8.0 milestone!
Initial Checks
- [x] I confirm that I'm using Pydantic V2
Description
Originally reported at pydantic/speedate#62
If a time has a
tzinfowith a non-trivial offset with e.g. depends on dst, then we can erroneously drop it.We should almost certainly not do this. I think our best options are either to serialise the timezone name (hard but powerful) or to error (correct but frustrating for users).
A third option would be to assume today and caclulate DST offset based on that, but I think that's also bad tbh.
Example Code
from pydantic import TypeAdapter from datetime import time from zoneinfo import ZoneInfo TypeAdapter(time).dump_json(time(12, 4, 5, 123456, tzinfo=ZoneInfo("America/Fortaleza"))) #> b'"12:04:05.123456"'Python, Pydantic & OS Version
pydantic version: 2.8.0a1 pydantic-core version: 2.18.3 pydantic-core build: profile=debug pgo=false install path: /Users/david/Dev/pydantic/pydantic/pydantic python version: 3.12.3 (main, May 8 2024, 09:46:04) [Clang 15.0.0 (clang-1500.3.9.4)] platform: macOS-14.4.1-arm64-arm-64bit related packages: mypy-1.10.0 typing_extensions-4.12.1 pyright-1.1.360 commit: c6e8bcf1
@davidhewitt i have something little better then option 3 (even i am fine with erroring out or parsing if that will be consistent across time)
lets call option 4 instead using today in option 3 use Unix start of time as "today" i have done it here and it is preaty stable
however for some reason this may still error out for some timezones (probably tz didn't exist at midnight 1.1.1970)
and i humby ask it to be an requrement (but if imposible drop it) since calling same code with same data on different day may get different serialized result (aka tz switches to DST) and that is probably not wanted by most users
and i humby ask it to be an requrement (but if imposible drop it) since calling same code with same data on different day may get different serialized result (aka tz switches to DST) and that is > probably not wanted by most users
I fully agree with this; I think we should almost certainly not have validation which behaves differently depending on the actual date! I listed option 3 just for completeness. I agree with your suggestion of using unix epoch if we were to pick a reference point to coerce variable timezones to fixed offsets. But I still think that coercion isn't great and it would be better to either fully support IANA timezones or just error when serializing and force users to think about what they want.
Hello!
While migrating one of our FastAPI apps from pydanticv1 to pydanticv2, I also noticed that this behavior differs between the two pydantic versions.
With pydanticv1:
>>> pyenv virtualenv 3.9.5 pydanticv1
>>> pyenv activate pydanticv1
>>> pip install 'pydantic<2'
>>> ipython
In [1]: from datetime import datetime, timezone
In [2]: from pydantic import BaseModel
In [3]: class Foo(BaseModel):
...: dt: datetime
...:
In [4]: foo = Foo(dt=datetime.now(tz=timezone.utc))
In [5]: foo.json()
Out[5]: '{"dt": "2024-06-17T15:32:31.825334+00:00"}'
With pydanticv2:
>>> pyenv virtualenv 3.9.5 pydanticv2
>>> pyenv activate pydanticv2
>>> pip install 'pydantic>=2'
>>> ipython
In [1]: from datetime import datetime, timezone
In [2]: from pydantic import BaseModel
In [3]: class Foo(BaseModel):
...: dt: datetime
...:
In [4]: foo = Foo(dt=datetime.now(tz=timezone.utc))
In [5]: foo.json() # or foo.model_dump_json()
Out[5]: '{"dt":"2024-06-17T15:33:09.300525Z"}'
@davidhewitt Concerning the second option and serializing the timezone name, do you have any idea of this output?
Maybe the timezone name into brackets like this 12:04:05.123456 [America/Fortaleza] ?
I think what we should do is support RFC 9557.
It looks like the correct thing to do is to serialize the UTC offset as it was at that instant, plus the timezone name if available:
1996-12-19T16:39:57-08:00[America/Los_Angeles]
While that's in progress, here's a band-aid for anyone else having issues with Z suddenly appearing in your timestamps:
from datetime import datetime
from typing import Annotated
from pydantic import PlainSerializer
DateTime = Annotated[
datetime, PlainSerializer(lambda dt: dt.isoformat())
]
And just use DateTime instead of datetime on any field that needs it.
I've revisited this extensively today and am less sure this is a bug.
Python datetime behaves the same way:
>>> from datetime import time, timezone, timedelta
>>> from zoneinfo import ZoneInfo
>>> time(12, 4, 5, 123456, tzinfo=ZoneInfo("America/Fortaleza")).isoformat()
'12:04:05.123456'
According to datetime rules, this time object is "naive" despite having a non-none tzinfo: https://docs.python.org/3/library/datetime.html#determining-if-an-object-is-aware-or-naive
I think it's probably still unintended for users to hit this case, so an exception would still be useful but would diverge from Python and would be breaking, so I think it'd need to be an opt-in.
The full named (IANA) timezone support for datetime objects seems like an unambiguous improvement, but I think that's a separate feature request.