Add methods for constructing datetime from microsecond timestamp
Feature or enhancement
Proposal:
Based on PEP 564, the time module has `_ns()` methods that generate integer timestamps of nanoseconds since the epoch. I think it might make sense for `datetime.datetime` to have similar methods that return/take integer timestamps with the full microsecond precision that the type supports. Specifically:
- `datetime.fromtimestamp_us(timestamp, tz=None)`
- `datetime.utcfromtimestamp_us(timestamp)`
- `datetime.timestamp_us()`
I'm intending to write a change adding these methods.
If you want to reconstruct a datetime from microseconds since the Unix epoch in a provided timezone, you can currently do:
(datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=timestamp_us)).astimezone(tz)
This is a general solution, but it constructs two extra datetime instances and a timedelta along the way. datetime.fromtimestamp could bring that down to just one, but it takes float seconds, so there's no way to avoid losing precision (for dates far enough in the future, but well short of datetime.max). One solution is to handle the micros separately:
datetime.fromtimestamp(timestamp_us // 1000000, tz).replace(microsecond=timestamp_us % 1000000)
This still requires constructing two datetime objects, and it bakes in an assumption that the timezone offset doesn't affect and isn't affected by the microsecond component. (Which probably is the case! I think. Timezone code is definitely a place where I'd prefer obsessive generality.)
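To make both points concrete, here's a quick check (my own sketch, not part of any patch) that the two integer spellings agree while the float path drops precision. UTC is used so the offset assumption is trivially safe, and the exact float error depends on IEEE-754 doubles:

```python
from datetime import datetime, timedelta, timezone

timestamp_us = 250_000_000_000_123_456  # year ~9892, well short of datetime.max

via_timedelta = (datetime(1970, 1, 1, tzinfo=timezone.utc)
                 + timedelta(microseconds=timestamp_us))
via_divmod = datetime.fromtimestamp(
    timestamp_us // 1_000_000, timezone.utc
).replace(microsecond=timestamp_us % 1_000_000)
via_float = datetime.fromtimestamp(timestamp_us / 1_000_000, timezone.utc)

assert via_timedelta == via_divmod   # the integer paths agree exactly
print(via_float.microsecond)         # not 123456: float seconds can't carry the µs
```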
Open questions:
- Would `_us` or `_micros` be a better suffix for the name? The former seems more consistent with `time_ns()`, but I know some object to the use of "u" for "micro".
- Would it be better to do `_ns` instead? This would require figuring out, and documenting, whether the `fromtimestamp_ns` methods round or truncate (I'd favor the latter; see the sketch after this list), and that's adding behavior that's less clear from the function name alone. But it would be more consistent with those `time` `_ns` methods, which might make documentation a little easier (e.g. the documentation for `datetime.fromtimestamp` refers to `time.time`).
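For illustration, the truncate-vs-round question in integer arithmetic (just a sketch of the two candidate behaviors):

```python
timestamp_ns = 1_666_000_000_123_456_789

# Truncation (my preference): drop the sub-microsecond digits.
trunc_us = timestamp_ns // 1_000            # ...123_456 µs

# Rounding (half-up shown; ties-to-even would need more care):
round_us = (timestamp_ns + 500) // 1_000    # ...123_457 µs
```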
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
I asked about this on the python-ideas list:
https://mail.python.org/archives/list/[email protected]/thread/46PNI5Z24OTYMIHG5JMTC4JLQHT35W2B/
@yaseppochi thought it seemed a good idea.
(I didn't know about the Discourse forum before filling this out. Not quite sure if it counts as minor or not.)
I think this is a good idea generally. I'm inclined to not make it its own method if at all possible, and to avoid baking in the assumption that datetime always has microsecond precision, since #59648 is kind of high on my list of things to be fixed.
Probably the cleanest thing to do would be to make datetime.fromtimestamp accept Decimal objects, since those have arbitrary precision, but I think Decimal objects themselves carry reasonably significant overhead. I am not terribly familiar with them, but I don't see an easy way to construct one with arbitrary precision after the decimal point in less time than it takes to call replace (using Python 3.11):
>>> %timeit Decimal(f"{timestamp_us // 1_000_000}.{timestamp_us % 1_000_000:06d}")
2.38 µs ± 57.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
>>> %timeit Decimal(timestamp_us) / Decimal(1_000_000)
4.7 µs ± 61.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
>>> %timeit datetime.fromtimestamp(timestamp_us / 1_000_000).replace(microsecond=timestamp_us % 1_000_000)
1.23 µs ± 27.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
I feel like the best ways to implement this would be one of:
- Start accepting `Decimal` objects (maybe we should be doing this anyway), then see if we can't do anything about creating a faster `Decimal` constructor for this sort of situation. Maybe there could be a `Decimal.from_integer_ratio(numerator, denominator)` constructor that is faster? Not sure how much of the speed penalty of `Decimal` comes from the constructor itself.
- Start accepting a tuple of two integers and just always make sure we do math on those integers such that the precision is always better than the precision of the datetime. Alternatively, we could create some sort of lightweight struct-like class that encapsulates that, like `PrecisionTimestamp(numerator, denominator)` (see the sketch after this list).
- Add a `microsecond` keyword argument that can only be used when the timestamp argument is an integer. We could expand to `nanosecond` with the same semantics we decide on for #59648.
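A rough sketch of what the second option could look like (everything here is hypothetical, including the `PrecisionTimestamp` name from above):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class PrecisionTimestamp:
    """Seconds since the epoch as an exact integer ratio."""
    numerator: int
    denominator: int

    def to_datetime(self, tz=timezone.utc):
        # All-integer math down to datetime's current µs precision;
        # sub-microsecond digits are truncated. (This uses the replace
        # trick; a real implementation would handle the offset more carefully.)
        total_us = self.numerator * 1_000_000 // self.denominator
        seconds, micros = divmod(total_us, 1_000_000)
        return datetime.fromtimestamp(seconds, tz).replace(microsecond=micros)

# 1_666_000_000.123456 seconds, expressed as microseconds over 10**6:
ts = PrecisionTimestamp(1_666_000_000_123_456, 1_000_000)
print(ts.to_datetime())  # 2022-10-17 ... .123456+00:00
```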
I think even if we decide to put it in its own method, the design consideration of trying to avoid baking in assumptions about the precision of the timestamp is a sound one.
> (I didn't know about the Discourse forum before filling this out. Not quite sure if it counts as minor or not.)
I think this is the appropriate place to file this.
Interesting, the thread in https://github.com/python/cpython/issues/59648 suggests patches that add a `fromnanoseconds` method (though I think if going with that design it could use a `tonanoseconds` counterpart as well).
The thing that pushed me away from the datetime + timedelta solution was mostly the allocation of temporary objects, so I'm not sure using Decimal or another struct is a big improvement. Also, the idea of using Decimal for timestamps was raised in PEP 410, which was rejected.
The decision with `time.time_ns` makes me lean towards separate methods, though it's not the same tradeoff.
> ... Maybe there could be a `Decimal.from_integer_ratio(numerator, denominator)` constructor that is faster? Not sure how much of the speed penalty of `Decimal` comes from the constructor itself.
There's a `decimal.Context.divide` method, which also takes integers.
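For example (assuming the default context, whose 28-digit precision comfortably covers microsecond timestamps), and noting that Context methods accept plain ints:

```python
from decimal import Context

ctx = Context()  # default prec=28 is exact for µs-resolution timestamps
timestamp_us = 1_666_000_000_123_456

# No up-front Decimal construction needed; ints are accepted directly:
ts = ctx.divide(timestamp_us, 1_000_000)  # Decimal('1666000000.123456')
```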