vega
Should Date parsing parse integer as year by default?
Date parsing currently treats integers as timestamps by default.
However, a common problem is that people store years as integers far more often than they store timestamps. While users can adjust the parse to "%Y", most people don't know they have to do so.
Thus, I wonder if we should adjust the default (or be smarter about date type inference in Vega).
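To make the mismatch concrete, here is a minimal sketch in plain JavaScript (not Vega's own code) of what happens when the same value 2019 is read as a timestamp versus as a year string. The variable names are illustrative only.

```javascript
// A number passed to the Date constructor is read as milliseconds since the Unix epoch.
const asTimestamp = new Date(2019);

// A four-digit string is parsed as an ISO 8601 year, i.e. midnight UTC on January 1.
const asYear = new Date("2019");

console.log(asTimestamp.toISOString()); // 1970-01-01T00:00:02.019Z
console.log(asYear.toISOString());      // 2019-01-01T00:00:00.000Z
```

A column of integer years interpreted as timestamps therefore collapses into a few seconds just after the start of 1970.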
cc: @jakevdp
A couple thoughts:
- If we do this, doesn't it preclude the use of timestamps as inputs? (Yes, we could check the value and use timestamps for larger integers, but this seems arbitrary, brittle, and in some cases potentially infuriating.)
- Your suggestion would seem to boil down to changing vega-util's toDate method. FWIW, the JS built-in Date.parse maps a single number (either as a JS number, or a string) to the first day of that year. By extension, I believe Vega should do the same for strings like "2019".
- Integers will by default be parsed as number values (not Dates) by Vega. So for number input one would already need to specify a Date type explicitly, right? I'm assuming the primary issue is that in VL/Altair one might indicate a temporal type and have numbers treated as dates. Is that right?
- Is there a reason this can't be handled at the VL level?
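The value-based check mentioned in the first point could look roughly like the sketch below. This is a hypothetical illustration, not vega-util's toDate: the function name `guessDate` and the cutoff `MAX_YEAR` are assumptions made up for this example.

```javascript
// Hypothetical cutoff: integers from 0 to 9999 are treated as calendar years.
const MAX_YEAR = 9999;

// Hypothetical helper, not part of vega-util.
function guessDate(value) {
  if (Number.isInteger(value) && value >= 0 && value <= MAX_YEAR) {
    // Build January 1 of that year in UTC. setUTCFullYear avoids the
    // two-digit-year remapping that Date.UTC applies to values 0 through 99.
    const d = new Date(0);
    d.setUTCFullYear(value, 0, 1);
    return d;
  }
  // Anything else falls back to the millisecond-timestamp reading.
  return new Date(value);
}

console.log(guessDate(2019).toISOString());  // 2019-01-01T00:00:00.000Z
console.log(guessDate(5000).toISOString());  // 5000-01-01T00:00:00.000Z
console.log(guessDate(10000).toISOString()); // 1970-01-01T00:00:10.000Z
```

The last two calls show the brittleness: a timestamp of 5000 ms is silently read as the year 5000, while 10000 ms is still read as a timestamp, so the result depends on an arbitrary threshold.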
> Is there a reason this can't be handled at the VL level?
Vega-Lite doesn't have access to the data, so it only knows that the data is temporal, but doesn't know what the data looks like at all.
> Integers will by default be parsed as number values (not Dates) by Vega. So for number input one would already need to specify a Date type explicitly, right? I'm assuming the primary issue is that in VL/Altair one might indicate a temporal type and have numbers treated as dates. Is that right?
Yep
> Your suggestion would seem to boil down to changing vega-util's toDate method. FWIW, the JS built-in Date.parse maps a single number (either as a JS number, or a string) to the first day of that year. By extension, I believe Vega should do the same for strings like "2019".
Given that we at least parse "2019" as a year, it might make sense to just document that integer = timestamp by default, esp. given that smarter logic can be too brittle and would preclude timestamp inputs.
Technically this would require a breaking change, so we should probably shelve it for the time being. That said, in the future we could consider updating the date parsing as you suggest here and require a parsing specifier of date:"%Q" for timestamp support (which I don't think is a particularly common need).
In any case, I just tested and d3.timeParse('%Q')(Date.now()) behaves as one would expect.
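For reference, explicit parse specifiers in a Vega-Lite spec might be written as in the sketch below, shown as a JavaScript object. The field names `year`, `ts`, and `value` and the inline data are hypothetical, and the `date:'...'` quoting follows the form used in the Vega data format documentation; treat this as an illustration rather than a tested spec.

```javascript
// Minimal Vega-Lite spec object with explicit date parsing per field.
const spec = {
  data: {
    values: [
      { year: 2018, ts: Date.UTC(2018, 0, 1), value: 3 },
      { year: 2019, ts: Date.UTC(2019, 0, 1), value: 5 }
    ],
    format: {
      parse: {
        year: "date:'%Y'", // read integers such as 2019 as calendar years
        ts: "date:'%Q'"    // read integers as millisecond timestamps
      }
    }
  },
  mark: "line",
  encoding: {
    x: { field: "year", type: "temporal" },
    y: { field: "value", type: "quantitative" }
  }
};

console.log(JSON.stringify(spec.data.format.parse));
```

Under the proposed change, the "%Y" entry would become unnecessary for integer years, and the "%Q" entry would be what timestamp users have to add.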
Should we re-open and put it in 6.0 milestone then?
I do really like the suggestion of parsing integers as years by default. As already mentioned above, I think this is a more common date storage format than integers as timestamps. It would be convenient to be able to simply specify that the data type is temporal in VL and have integers rendered as year. Doing that currently yields some odd results and one must recast the column as a string to get the desired behavior:
Would you consider re-opening this and implementing this feature in Vega? We could change this on the Altair side of things and recast ints as str automatically when the temporal data type is used, but would like to avoid being inconsistent with VL here if possible.
We're also discussing this in a few other places:
- https://github.com/altair-viz/altair/discussions/3140
- https://github.com/hex-inc/vegafusion/issues/402
reopened and moved into 6.0