should the use of `timeUnit` prevent `type: "nominal"`?

Open nicolaskruchten opened this issue 2 years ago • 2 comments

In the following case (under altair==5.0.0rc1), I'm using `timeUnit()` on a column containing strings, and the resulting JSON spec forces `type: "nominal"`, which prevents Vega-Lite from inferring that it's actually "temporal":

import altair as alt
from vega_datasets import data

chart = (
    alt.Chart(data.movies())
    .mark_line()
    .encode(
        alt.X("Release_Date").timeUnit("year"),
        alt.Y("Worldwide_Gross").aggregate("sum"),
    )
)

nicolaskruchten • Mar 16 '23 01:03

Thanks for reporting! This is happening because of the automatic type inference for pandas dataframes, which recognizes that the `Release_Date` column consists of strings and sets the type to nominal. When an aggregation or timeUnit is present in the shorthand (like `year(Release_Date)`), there is an explicit override that sets the type to temporal, but there is no such override when using `.aggregate` or `.timeUnit`, although there probably should be. This information is not part of the shorthand, so I believe it would have to be added outside of the `parse_shorthand` function (or passed to it).

https://github.com/altair-viz/altair/blob/f9bc98fb3411212754f00d57617f5176837b92b2/altair/utils/core.py#L532-L540
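
To make the described behaviour concrete, here is a toy model in plain Python. It is not Altair's actual code: `infer_type` and `resolve_type` are hypothetical names, and the inference rule is deliberately crude. It only illustrates the idea that an explicit time unit should win over the type inferred from the column's values.

```python
# Toy model of the override discussed above; NOT Altair's real implementation.
# `infer_type` and `resolve_type` are hypothetical helper names.

def infer_type(values):
    """Crude stand-in for dataframe-based type inference."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if numeric:
        return "quantitative"
    # Strings, including date-like strings, end up here.
    return "nominal"


def resolve_type(values, time_unit=None):
    """Infer a type, but let an explicit time unit take precedence."""
    if time_unit is not None:
        return "temporal"
    return infer_type(values)


dates = ["Jun 12 1998", "Aug 07 1998"]
print(resolve_type(dates))          # inference alone: nominal
print(resolve_type(dates, "year"))  # time unit present: temporal
```

In this sketch the method-chained case from the report corresponds to calling `resolve_type(dates)` without the time unit, which is why the inferred nominal type ends up in the spec.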

Another approach would be to rely on Vega-Lite to infer types when an aggregation or timeUnit is present (which I think always works as expected in those cases); see #2584.
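
A rough sketch of that second idea, again in plain Python with hypothetical names (`build_encoding` is not an Altair function): when a time unit or aggregate is present, the encoding simply leaves out `type`, on the assumption stated above that Vega-Lite then picks a sensible one itself.

```python
# Toy sketch of "omit the type and let Vega-Lite infer it"; NOT Altair's code.
# `build_encoding` is a hypothetical helper name.

def build_encoding(field, inferred_type, time_unit=None, aggregate=None):
    """Build an encoding dict, skipping `type` when Vega-Lite can infer it."""
    enc = {"field": field}
    if time_unit is not None:
        enc["timeUnit"] = time_unit
    if aggregate is not None:
        enc["aggregate"] = aggregate
    # Only emit the dataframe-inferred type for plain field encodings.
    if time_unit is None and aggregate is None:
        enc["type"] = inferred_type
    return enc


print(build_encoding("Release_Date", "nominal", time_unit="year"))
print(build_encoding("Title", "nominal"))
```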

joelostblom • Mar 16 '23 23:03

Yeah I think the latter would probably work ok. I really like that Altair auto-detects non-nominal types from dataframes in general, but in this case it's getting in the way a bit :)

nicolaskruchten • Mar 16 '23 23:03