altair
altair copied to clipboard
WIP: Lifting parameters to the top level
Updated version of https://github.com/altair-viz/altair/pull/2671. I believe the functionality is the same, but the code is a little cleaner. @mattijn As always I am happy for any comments!
Here was the description of https://github.com/altair-viz/altair/pull/2671:
This is a draft version (some tests are currently failing) of the strategy suggested by @arvind and @mattijn here for dealing with parameters in multi-view Altair charts. We lift all parameters to the top level. In the case of selection parameters, we give the parameter a "views" property to indicate where the selections should be happening.
@mattijn I haven't worked on the Vega-Lite issues you raised in https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790. If this code seems to be working for you (and not too complicated), I will start thinking about those issues.
Also if we decide this is working as expected, I will go through the 8 (?) failing tests and either change the tests or update my code.
Great @ChristopherDavisUCI! Looks more readable. I like the changes.
You might align these definitions: https://github.com/ChristopherDavisUCI/altair/blob/liftparams/tools/generate_schema_wrapper.py#L29-L30
with the changes you introduced here: https://github.com/altair-viz/altair/blob/70682835bd4d80e31a378890f6f2274501cf2849/altair/vega/v5/schema/init.py
including a run of the generate_schema_wrapper.py
, since:
!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import altair as alt
alt.SCHEMA_VERSION
gives me v5.2.0
.
I checked a few of these failing tests and they seem to be related to the more recent discussion we had regarding the behaviour of layered params. These tests seems compact and interesting and I did not really see a case that require changing the tests assertion of these failing tests (excluding interactive_layered_crossfilter.py
).
Thanks @mattijn! I see now that I had updated the Vega version in the wrong place; it should be corrected now. I also updated vega-lite to 5.5.0. Does that sound like what you had in mind? (It looks like the Vega-Lite editor uses vega-lite 5.2.0.)
I'll take a closer look at the failing tests soon.
@mattijn Here's one concern I have related to the failing tests. If I understand right, the failing layer chart test, for example, asserts that adding a selection parameter to a LayerChart should be the same as adding a selection parameter to the first chart in the layer
list. I think that should be easy to implement, but what if a later version of Vega-Lite says these two parameter placements should both be allowed and should behave differently? Do you see what I'm worried about?
If you think that sort of hypothetical future issue could be fixed easily enough later, I'll go ahead and try to add functionality so there is more flexibility in placing selection parameters on top level charts.
I think I understand what you mean. I think I've raised and discuss this in this VL-issue, https://github.com/vega/vega-lite/issues/7854#issuecomment-1244479284.
At the moment we can assume that multiple selection parameter placements within a layered chart results in potentially interference and that therefor a single layered object can only have 1 name
/view
combination.
I see it as a single name definition on the full layer-object, like the following in VL-json:
"layer": [
"name":"view_0",
{},
{}
]
But this type of structure is not allowed within VL.
We might still opt for defining this as such and then change it (including a comment) subsequently into:
"layer": [
{"name":"view_0"},
{}
]
Then when this behavior is potentially changing in the future, it is potentially easier to adapt to these changes.
This approach might also help in presenting a user-friendly error when defining two different selection parameters in the layer or for resolving two equal defined selection parameters in the layer (see also https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790).
Edit As I think about it, maybe the parameter should only be distributed if it doesn't have a views
property already? Overall these fixes feel a little ad hoc to me...
Hi @mattijn, I fixed three of the failing tests (for concat, hconcat, vconcat) by doing the following:
- Reverting to the previous code that, for any parameter added to the top level, distributes that parameter across the concat list.
- Resetting some instances of the Chart
_counter
in the test: https://github.com/altair-viz/altair/blob/c5fe43f9123ab95e1dd25418c1ec9c6f595a4a49/altair/vegalite/v5/tests/test_api.py#L630-L638)
Does that all sound reasonable? An alternative option to resetting the _counter
would be to weaken the requirements that the dictionaries be identical.
I don't see a problem for resetting the _counter
instance when comparing two Chart objects created after each other.
Edit As I think about it, maybe the parameter should only be distributed if it doesn't have a views property already? Overall these fixes feel a little ad hoc to me...
Hm, not sure.. Given:
base = alt.Chart("data.csv").mark_point()
selection = alt.selection_point()
Then for concatenating, this is proper VL-style:
chart1 = alt.concat(base.add_params(selection), base.add_params(selection))
This is not proper VL-style, but allowed by Altair:
chart2 = alt.concat(base, base).add_params(selection)
So, doing a redistribution there for the h
/v
/concat
seems valid to me.
Thinking a bit more, I do agree that one cannot just blindly distribute all params to all objects within concat
:
Eg. this will not work:
alt.concat(alt.layer(base, base), alt.layer(base, base)).add_params(selection)
But then it should be somehow recognize to do:
alt.concat(alt.layer(base.add_params(selection), base), alt.layer(base.add_params(selection), base))
Edit: the tests I considered:
!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import altair as alt
from vega_datasets import data
def test_compound_add_selections(charttype):
base = alt.Chart("data.csv").mark_point()
selection = alt.selection_point()
alt.Chart._counter = 0
chart1 = charttype(base.add_params(selection), base.add_params(selection))
alt.Chart._counter = 0
chart2 = charttype(base, base).add_params(selection)
print(f'{chart1.to_dict()}\n{chart2.to_dict()}')
#assert chart1.to_dict() == chart2.to_dict()
def test_repeat_add_selections():
base = alt.Chart("data.csv").mark_point()
selection = alt.selection_point()
chart1 = base.add_params(selection).repeat(list("ABC"))
chart2 = base.repeat(list("ABC")).add_params(selection)
print(f'{chart1.to_dict()}\n{chart2.to_dict()}')
#assert chart1.to_dict() == chart2.to_dict()
def test_facet_add_selections():
base = alt.Chart("data.csv").mark_point()
selection = alt.selection_point()
chart1 = base.add_params(selection).facet("val:Q")
chart2 = base.facet("val:Q").add_params(selection)
print(f'{chart1.to_dict()}\n{chart2.to_dict()}')
#assert chart1.to_dict() == chart2.to_dict()
def test_layer_add_selection():
base = alt.Chart("data.csv").mark_point()
selection = alt.selection_point()
chart1 = alt.layer(base.add_params(selection), base)
chart2 = alt.layer(base, base).add_params(selection)
print(f'{chart1.to_dict()}\n{chart2.to_dict()}')
#assert chart1.to_dict() == chart2.to_dict()
def test_interactive_layered_crossfilter():
source = alt.UrlData(
data.flights_2k.url,
format={'parse': {'date': 'date'}}
)
brush = alt.selection(type='interval', encodings=['x'])
# Define the base chart, with the common parts of the
# background and highlights
base = alt.Chart(source).mark_bar().encode(
x=alt.X(alt.repeat('column'), type='quantitative', bin=alt.Bin(maxbins=20)),
#x=alt.X('distance', type='quantitative', bin=alt.Bin(maxbins=20)),
y='count()'
).properties(
width=160,
height=130
)
# gray background with selection
background = base.encode(
color=alt.value('#ddd')
).add_params(brush)
# blue highlights on the transformed data
highlight = base.transform_filter(brush)
# layer the two charts & repeat
chart = alt.layer(
background,
highlight,
data=source
).transform_calculate(
"time",
"hours(datum.date)"
).repeat(column=["distance", "delay", "time"])
#assert chart
print(f'{chart.to_dict()}')
for charttype in [alt.concat, alt.hconcat, alt.vconcat]:
print(charttype)
test_compound_add_selections(charttype)
test_repeat_add_selections()
test_facet_add_selections()
test_layer_add_selection()
test_interactive_layered_crossfilter()
Hi @mattijn, I think the only failing tests at this point are the two related to test_interactive_layered_crossfilter
. And as far as I understand, I don't think we know how to generate this chart with our current strategy.
Do you see a "parameters at the top level with named views" version of this Vega-Lite schema that is allowed? https://vega.github.io/vega-lite/examples/interactive_layered_crossfilter.html Or should we just resign ourselves to our strategy not working for this example at the moment?
Great progress @ChristopherDavisUCI! Well done.
Regarding the repeat
operator, yes, let's not focus too much on this in this PR.
I tried a few things and I think this is working pretty good.
Still a question: In https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790 there were 4 scenarios from which only scenario 1 was working. Now scenario 2 also works with the latest changes, but scenario 3 & 4 do still give errors.
Do you think this is something that should be resolved within this PR?
I had another go using repeat
and params
: https://github.com/vega/vega-lite/issues/6890#issuecomment-1266821631. It seems there is more possible than first anticipated (I think we want option 3 & 6, but is option 2 also acceptable here?).
Those six examples you provided https://github.com/vega/vega-lite/issues/6890#issuecomment-1266821631 are super helpful @mattijn.
I haven't wrapped my head around the different scenarios yet. Just as a big-picture overview, do you think your option 6 will give us a way to produce this example with the params
at the top level?
https://vega.github.io/vega-lite/examples/interactive_layered_crossfilter.html
(At least in this PR, I think we should try to consistently have params
appear at the top level.)
Yes I think so. We should at least be able to get a Vega-Lite spec that compiles. The interaction and axis will not be perfect, since these VL-bug(s): https://github.com/vega/vega-lite/issues/8446, which is hopefully a duplicate of https://github.com/vega/vega-lite/issues/8348, but we should be able to get a chart that renders.
If we instead focus on the Altair version of option 6, we can test better.
Here the Altair spec with repeat
commented out:
!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import pandas as pd
import altair as alt
source = pd.read_csv('https://unpkg.com/[email protected]/data/weather.csv')
highlight = alt.selection_point(
nearest=True, on="mouseover", encodings=["x"], clear="mouseout"
)
rule = (
alt.Chart()
.mark_rule()
.encode(
x=alt.X("month(date):T"),
color=alt.value("black"),
opacity=alt.condition(highlight, alt.value(1), alt.value(0), empty=False),
)
)
line = (
alt.Chart()
.mark_line(point=True)
.encode(
x=alt.X("month(date):T"),
y=alt.Y("mean(precipitation):Q"),
#y=alt.Y(alt.repeat('repeat'), aggregate='mean'),
color=alt.Color("location:N"),
)
)
alt.layer(rule, line, data=source, height=200, width=200).add_params(
highlight
)#.repeat(repeat=["temp_max", "precipitation", "wind"])
If I add the repeat
operator and change the y
encoding to use the repeat
, I get the following error:
SchemaValidationError: Invalid specification
altair.vegalite.v5.schema.channels.OpacityValue, validating 'additionalProperties'
Additional properties are not allowed ('param', 'empty' were unexpected)
alt.RepeatChart(...)
This should compile, so we still miss something in the alt.RepeatChart()
?
Hi @mattijn, thanks for the continued testing! I think I got the RepeatChart issue fixed. I haven't thought yet about your scenarios 3 and 4 that you mentioned in https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790 (but can confirm they're still not working). I will think about that now.
Could you please take another look and see if things are now behaving mostly as you expected?
Is it obvious to you which tests have failed in the Python 3.9 build? On my end they seem to be passing.
Tests on Github Actions are failing because this PR in the altair_viewer
repository is not yet merged: https://github.com/altair-viz/altair_viewer/pull/51.
I test your latests changes and all seems fine to me. Nice!
I noticed there is still one assert
in your latest commits, you probably didn't want to include this: https://github.com/altair-viz/altair/pull/2684/commits/b79460deca9f7bb051d09956d1e8646f3c578c2e#diff-bbe4e187b18d242a366c820d023afd89041759cb96e4ec66c3f34559a72c2f9dR2332
Thanks as always for the feedback @mattijn. I think your four scenarios from https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790 are currently working. Could you please check and see if you agree? Anything overall that seems not to be working?
Nice as always @ChristopherDavisUCI! For the following layered facet chart, I still have two different errors. The first specification is a layered facet chart without parameters defined:
import altair as alt
from vega_datasets import data
iris = data.iris.url
hover = alt.selection_point(on='mouseover', nearest=True)
base = alt.Chart().encode(
x='petalLength:Q',
y='petalWidth:Q',
color='species:N' # alt.condition(hover, 'species:N', alt.value('lightgray'))
).properties(
width=180,
height=180,
).add_params(hover)
points = base.mark_point()
text = base.mark_text(dy=-5).encode(
text = 'species:N',
opacity = alt.value(1) # alt.condition(hover, alt.value(1), alt.value(0), empty=False)
)
chart = alt.layer(points, text, data=iris).facet(
'species:N',
) # .add_params(hover)
chart # .to_dict()
Returns:
SchemaValidationError: Invalid specification altair.vegalite.v5.api.Chart, validating 'required' 'data' is a required property alt.FacetChart(...)
Not sure where I should add my data, I tried within Chart()
, within layer(data=iris
) and facet(data=iris)
, but always the same error.
And the second layered facet chart including defined parameters:
import altair as alt
from vega_datasets import data
iris = data.iris.url
hover = alt.selection_point(on='mouseover', nearest=True)
base = alt.Chart().encode(
x='petalLength:Q',
y='petalWidth:Q',
color=alt.condition(hover, 'species:N', alt.value('lightgray')) # 'species:N'
).properties(
width=180,
height=180,
).add_params(hover)
points = base.mark_point()
text = base.mark_text(dy=-5).encode(
text='species:N',
opacity=alt.condition(hover, alt.value(1), alt.value(0), empty=False) # alt.value(1)
)
chart = alt.layer(points, text).facet(
'species:N',
data=iris
).add_params(hover)
chart # .to_dict()`
Gives me:
SchemaValidationError: Invalid specification altair.vegalite.v5.schema.channels.Color, validating 'additionalProperties' Additional properties are not allowed ('param' was unexpected) alt.FacetChart(...)
Thanks @mattijn! These "data missing" error messages usually mean the parameters are in the wrong place, and that seems to be the case here. I'll take a look.
Hi @mattijn, luckily it seems like the same fix that worked for LayerChart also works in this case. Do your examples now seem to be working? Please keep sending any non-working examples you can come up with!
Does this work for you?:
import altair as alt
from vega_datasets import data
iris = data.iris.url
hover = alt.selection_point(on='mouseover', nearest=True)
base = alt.Chart().encode(
x='petalLength:Q',
y='petalWidth:Q',
color=alt.condition(hover, 'species:N', alt.value('lightgray'))
).properties(
width=180,
height=180,
).add_params(hover)
points = base.mark_point()
text = base.mark_text(dy=-5).encode(
text='species:N',
opacity=alt.condition(hover, alt.value(1), alt.value(0), empty=False)
)
chart = alt.layer(points, text, data=iris).facet(
'species:N'
)
chart
For me not..
For me that example is working @mattijn. (Or at least it displays without an error.)
When you run print(chart.to_json())
, does the params
property show up at the top level (that's where we want it)? Maybe the GitHub repo hadn't fully updated when you tested it?
Ah, yes! Latest commit didn't come properly through. Yes, this works now as well!
Great! Please keep looking for examples that don't work as expected.
This is not working, it resolves as option 3 from https://github.com/vega/vega-lite/issues/6890#issuecomment-1266821631. Probably outside the scope of this PR?
import pandas as pd
import altair as alt
source_url = "https://unpkg.com/[email protected]/data/weather.csv"
highlight = alt.selection_point(
nearest=True, on="mouseover", encodings=["x"], clear="mouseout"
)
chart = alt.Chart(source_url).mark_circle().encode(
x=alt.X("month(date):T"),
color=alt.Color("location:N"),
y=alt.Y(alt.repeat(), type="quantitative", aggregate="mean"),
size=alt.condition(highlight, alt.value(200), alt.value(50), empty=False),
).repeat(["temp_max", "precipitation", "wind"]).add_params(highlight)
chart
Hmm, do you see what's wrong with the json that results from your example? At first glance I would have expected this json to be correct. Open the Chart in the Vega Editor
Nope, reported here: https://github.com/vega/vega-lite/issues/8452. Including a grouped issue regarding all the issues we have for the repeat
operator, https://github.com/vega/vega-lite/issues/8453.
I discovered two more issues that are not working:
First, .interactive()
does not work in combination with h
/v
-concat
:
!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import altair as alt
source = 'https://unpkg.com/[email protected]/data/weather.csv'
# interval = alt.selection_interval(encodings=['x'], bind='scales')
line = (
alt.Chart(height=100)
.mark_bar(point=True)
.encode(
x=alt.X("date:T"),
y=alt.Y("precipitation:Q"),
color=alt.Color("location:N"),
)
)
alt.vconcat(line, line, data=source).interactive(bind_y=False) # .add_params(interval)
AttributeError: 'VConcatChart' object has no attribute 'interactive'
Second, I get a duplicate signal name error with the following:
highlight = alt.selection_point(
nearest=True, on="mouseover", encodings=["x"], clear="mouseout"
)
interval = alt.selection_interval(encodings=['x'], bind='scales')
line = (
alt.Chart(height=100)
.mark_line(point=True)
.encode(
x=alt.X("month(date):T"),
y=alt.Y("mean(precipitation):Q"),
color=alt.Color("location:N"),
)
) # .interactive(bind_y=False) # add_params(interval)
rule = (
alt.Chart()
.mark_rule()
.encode(
x=alt.X("month(date):T"),
color=alt.value("black"),
opacity=alt.condition(highlight, alt.value(1), alt.value(0), empty=False),
)
)
tick = (
alt.Chart()
.mark_tick(point=True)
.encode(
x=alt.X("month(date):T"),
y=alt.Y("mean(precipitation):Q"),
color=alt.Color("location:N"),
)
)
alt.vconcat(line, tick + rule, data=source).add_params(
highlight
)
Javascript Error: Duplicate signal name: "param_45_tuple" This usually means there's a typo in your chart specification. See the javascript console for the full traceback.
Thanks @mattijn I will check those out! I always appreciate these bug examples.
By the way, it's slightly easier for me when you use the "url" data syntax, because then the json that's created is much more manageable. So if it's ever easy to make that switch, please do!
Will do!
Thanks for changing the data! I will try to understand the syntax so I can do that myself in the future.
Are we sure vconcat and hconcat should have an interactive
method? I looked back to v3 and I see interactive
defined for Chart, LayerChart, FacetChart, and RepeatChart but not VConcatChart.
For me it make sense and the VL-scheme is OK with it. If defined on the concat level, the axis interaction will be done on both charts:

Open the Chart in the Vega Editor
This is different compare to when defining on each sub-chart (which works already):
Open the Chart in the Vega Editor
Even with unequal axis name and type, this is OK on the concat: Open the Chart in the Vega Editor
BTW, it's just a shortcut to the parameter variant (which works already):
interval = alt.selection_interval(encodings=['x'], bind='scales')
line = (
alt.Chart(height=100)
.mark_bar(point=True)
.encode(
x=alt.X("date:T"),
y=alt.Y("precipitation:Q"),
color=alt.Color("location:N"),
)
)
alt.vconcat(line, line, data=source).add_params(interval) # .interactive(bind_y=False)
For the second example regarding the duplicate signal, I think we had this issue before.
If I change .mark_line(point=True)
to .mark_line(point=False)
it works. See https://github.com/vega/vega-lite/issues/7854, but that Altair spec it refers to does work now already.. (https://github.com/altair-viz/altair/pull/2528#issuecomment-992989438)