altair icon indicating copy to clipboard operation
altair copied to clipboard

WIP: Lifting parameters to the top level

Open ChristopherDavisUCI opened this issue 2 years ago • 34 comments

Updated version of https://github.com/altair-viz/altair/pull/2671. I believe the functionality is the same, but the code is a little cleaner. @mattijn As always I am happy for any comments!

Here was the description of https://github.com/altair-viz/altair/pull/2671:

This is a draft version (some tests are currently failing) of the strategy suggested by @arvind and @mattijn here for dealing with parameters in multi-view Altair charts. We lift all parameters to the top level. In the case of selection parameters, we give the parameter a "views" property to indicate where the selections should be happening.

ChristopherDavisUCI avatar Sep 26 '22 21:09 ChristopherDavisUCI

@mattijn I haven't worked on the Vega-Lite issues you raised in https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790. If this code seems to be working for you (and not too complicated), I will start thinking about those issues.

Also if we decide this is working as expected, I will go through the 8 (?) failing tests and either change the tests or update my code.

ChristopherDavisUCI avatar Sep 26 '22 21:09 ChristopherDavisUCI

Great @ChristopherDavisUCI! Looks more readable. I like the changes.

You might align these definitions: https://github.com/ChristopherDavisUCI/altair/blob/liftparams/tools/generate_schema_wrapper.py#L29-L30

with the changes you introduced here: https://github.com/altair-viz/altair/blob/70682835bd4d80e31a378890f6f2274501cf2849/altair/vega/v5/schema/init.py

including a run of the generate_schema_wrapper.py, since:

!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import altair as alt
alt.SCHEMA_VERSION

gives me v5.2.0.

I checked a few of these failing tests and they seem to be related to the more recent discussion we had regarding the behaviour of layered params. These tests seems compact and interesting and I did not really see a case that require changing the tests assertion of these failing tests (excluding interactive_layered_crossfilter.py).

mattijn avatar Sep 26 '22 22:09 mattijn

Thanks @mattijn! I see now that I had updated the Vega version in the wrong place; it should be corrected now. I also updated vega-lite to 5.5.0. Does that sound like what you had in mind? (It looks like the Vega-Lite editor uses vega-lite 5.2.0.)

I'll take a closer look at the failing tests soon.

ChristopherDavisUCI avatar Sep 27 '22 05:09 ChristopherDavisUCI

@mattijn Here's one concern I have related to the failing tests. If I understand right, the failing layer chart test, for example, asserts that adding a selection parameter to a LayerChart should be the same as adding a selection parameter to the first chart in the layer list. I think that should be easy to implement, but what if a later version of Vega-Lite says these two parameter placements should both be allowed and should behave differently? Do you see what I'm worried about?

If you think that sort of hypothetical future issue could be fixed easily enough later, I'll go ahead and try to add functionality so there is more flexibility in placing selection parameters on top level charts.

ChristopherDavisUCI avatar Sep 27 '22 12:09 ChristopherDavisUCI

I think I understand what you mean. I think I've raised and discuss this in this VL-issue, https://github.com/vega/vega-lite/issues/7854#issuecomment-1244479284.

At the moment we can assume that multiple selection parameter placements within a layered chart results in potentially interference and that therefor a single layered object can only have 1 name/view combination.

I see it as a single name definition on the full layer-object, like the following in VL-json:

"layer": [
    "name":"view_0", 
    {}, 
    {}
]

But this type of structure is not allowed within VL.

We might still opt for defining this as such and then change it (including a comment) subsequently into:

"layer": [   
    {"name":"view_0"}, 
    {}
]

Then when this behavior is potentially changing in the future, it is potentially easier to adapt to these changes.

This approach might also help in presenting a user-friendly error when defining two different selection parameters in the layer or for resolving two equal defined selection parameters in the layer (see also https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790).

mattijn avatar Sep 27 '22 17:09 mattijn

Edit As I think about it, maybe the parameter should only be distributed if it doesn't have a views property already? Overall these fixes feel a little ad hoc to me...

Hi @mattijn, I fixed three of the failing tests (for concat, hconcat, vconcat) by doing the following:

  • Reverting to the previous code that, for any parameter added to the top level, distributes that parameter across the concat list.
  • Resetting some instances of the Chart _counter in the test: https://github.com/altair-viz/altair/blob/c5fe43f9123ab95e1dd25418c1ec9c6f595a4a49/altair/vegalite/v5/tests/test_api.py#L630-L638)

Does that all sound reasonable? An alternative option to resetting the _counter would be to weaken the requirements that the dictionaries be identical.

ChristopherDavisUCI avatar Sep 29 '22 22:09 ChristopherDavisUCI

I don't see a problem for resetting the _counter instance when comparing two Chart objects created after each other.

Edit As I think about it, maybe the parameter should only be distributed if it doesn't have a views property already? Overall these fixes feel a little ad hoc to me...

Hm, not sure.. Given:

base = alt.Chart("data.csv").mark_point()
selection = alt.selection_point()

Then for concatenating, this is proper VL-style:

chart1 = alt.concat(base.add_params(selection), base.add_params(selection))

This is not proper VL-style, but allowed by Altair:

chart2 = alt.concat(base, base).add_params(selection)

So, doing a redistribution there for the h/v/concat seems valid to me.

Thinking a bit more, I do agree that one cannot just blindly distribute all params to all objects within concat: Eg. this will not work:

alt.concat(alt.layer(base, base), alt.layer(base, base)).add_params(selection)

But then it should be somehow recognize to do:

alt.concat(alt.layer(base.add_params(selection), base), alt.layer(base.add_params(selection), base))

Edit: the tests I considered:

!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import altair as alt
from vega_datasets import data

def test_compound_add_selections(charttype):
    base = alt.Chart("data.csv").mark_point()
    selection = alt.selection_point()
    alt.Chart._counter = 0
    chart1 = charttype(base.add_params(selection), base.add_params(selection))
    alt.Chart._counter = 0
    chart2 = charttype(base, base).add_params(selection)
    print(f'{chart1.to_dict()}\n{chart2.to_dict()}')
    #assert chart1.to_dict() == chart2.to_dict()

def test_repeat_add_selections():
    base = alt.Chart("data.csv").mark_point()
    selection = alt.selection_point()
    chart1 = base.add_params(selection).repeat(list("ABC"))
    chart2 = base.repeat(list("ABC")).add_params(selection)
    print(f'{chart1.to_dict()}\n{chart2.to_dict()}')
    #assert chart1.to_dict() == chart2.to_dict()

def test_facet_add_selections():
    base = alt.Chart("data.csv").mark_point()
    selection = alt.selection_point()
    chart1 = base.add_params(selection).facet("val:Q")
    chart2 = base.facet("val:Q").add_params(selection)
    print(f'{chart1.to_dict()}\n{chart2.to_dict()}')    
    #assert chart1.to_dict() == chart2.to_dict()

def test_layer_add_selection():
    base = alt.Chart("data.csv").mark_point()
    selection = alt.selection_point()
    chart1 = alt.layer(base.add_params(selection), base)
    chart2 = alt.layer(base, base).add_params(selection)
    print(f'{chart1.to_dict()}\n{chart2.to_dict()}')
    #assert chart1.to_dict() == chart2.to_dict()    

def test_interactive_layered_crossfilter():
    source = alt.UrlData(
        data.flights_2k.url,
        format={'parse': {'date': 'date'}}
    )

    brush = alt.selection(type='interval', encodings=['x'])

    # Define the base chart, with the common parts of the
    # background and highlights
    base = alt.Chart(source).mark_bar().encode(
        x=alt.X(alt.repeat('column'), type='quantitative', bin=alt.Bin(maxbins=20)),
        #x=alt.X('distance', type='quantitative', bin=alt.Bin(maxbins=20)),    
        y='count()'
    ).properties(
        width=160,
        height=130
    )

    # gray background with selection
    background = base.encode(
        color=alt.value('#ddd')
    ).add_params(brush)

    # blue highlights on the transformed data
    highlight = base.transform_filter(brush)

    # layer the two charts & repeat
    chart = alt.layer(
        background,
        highlight,
        data=source
    ).transform_calculate(
        "time",
        "hours(datum.date)"
    ).repeat(column=["distance", "delay", "time"])
    #assert chart
    print(f'{chart.to_dict()}')

for charttype in [alt.concat, alt.hconcat, alt.vconcat]:
    print(charttype)
    test_compound_add_selections(charttype)

test_repeat_add_selections()
test_facet_add_selections()
test_layer_add_selection()
test_interactive_layered_crossfilter() 

mattijn avatar Sep 30 '22 15:09 mattijn

Hi @mattijn, I think the only failing tests at this point are the two related to test_interactive_layered_crossfilter. And as far as I understand, I don't think we know how to generate this chart with our current strategy.

Do you see a "parameters at the top level with named views" version of this Vega-Lite schema that is allowed? https://vega.github.io/vega-lite/examples/interactive_layered_crossfilter.html Or should we just resign ourselves to our strategy not working for this example at the moment?

ChristopherDavisUCI avatar Oct 02 '22 23:10 ChristopherDavisUCI

Great progress @ChristopherDavisUCI! Well done. Regarding the repeat operator, yes, let's not focus too much on this in this PR.

I tried a few things and I think this is working pretty good.

Still a question: In https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790 there were 4 scenarios from which only scenario 1 was working. Now scenario 2 also works with the latest changes, but scenario 3 & 4 do still give errors.

Do you think this is something that should be resolved within this PR?

mattijn avatar Oct 03 '22 18:10 mattijn

I had another go using repeat and params: https://github.com/vega/vega-lite/issues/6890#issuecomment-1266821631. It seems there is more possible than first anticipated (I think we want option 3 & 6, but is option 2 also acceptable here?).

mattijn avatar Oct 04 '22 11:10 mattijn

Those six examples you provided https://github.com/vega/vega-lite/issues/6890#issuecomment-1266821631 are super helpful @mattijn.

I haven't wrapped my head around the different scenarios yet. Just as a big-picture overview, do you think your option 6 will give us a way to produce this example with the params at the top level? https://vega.github.io/vega-lite/examples/interactive_layered_crossfilter.html

(At least in this PR, I think we should try to consistently have params appear at the top level.)

ChristopherDavisUCI avatar Oct 05 '22 14:10 ChristopherDavisUCI

Yes I think so. We should at least be able to get a Vega-Lite spec that compiles. The interaction and axis will not be perfect, since these VL-bug(s): https://github.com/vega/vega-lite/issues/8446, which is hopefully a duplicate of https://github.com/vega/vega-lite/issues/8348, but we should be able to get a chart that renders.

If we instead focus on the Altair version of option 6, we can test better. Here the Altair spec with repeat commented out:

!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import pandas as pd
import altair as alt
source = pd.read_csv('https://unpkg.com/[email protected]/data/weather.csv')

highlight = alt.selection_point(
    nearest=True, on="mouseover", encodings=["x"], clear="mouseout"
)
rule = (
    alt.Chart()
    .mark_rule()
    .encode(
        x=alt.X("month(date):T"),
        color=alt.value("black"),
        opacity=alt.condition(highlight, alt.value(1), alt.value(0), empty=False),
    )
)

line = (
    alt.Chart()
    .mark_line(point=True)
    .encode(
        x=alt.X("month(date):T"),
        y=alt.Y("mean(precipitation):Q"),  
        #y=alt.Y(alt.repeat('repeat'), aggregate='mean'),
        color=alt.Color("location:N"),
    )
)

alt.layer(rule, line, data=source, height=200, width=200).add_params(
    highlight
)#.repeat(repeat=["temp_max", "precipitation", "wind"])

If I add the repeat operator and change the y encoding to use the repeat, I get the following error:

SchemaValidationError: Invalid specification

        altair.vegalite.v5.schema.channels.OpacityValue, validating 'additionalProperties'

        Additional properties are not allowed ('param', 'empty' were unexpected)
        
alt.RepeatChart(...)

This should compile, so we still miss something in the alt.RepeatChart()?

mattijn avatar Oct 05 '22 19:10 mattijn

Hi @mattijn, thanks for the continued testing! I think I got the RepeatChart issue fixed. I haven't thought yet about your scenarios 3 and 4 that you mentioned in https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790 (but can confirm they're still not working). I will think about that now.

Could you please take another look and see if things are now behaving mostly as you expected?

Is it obvious to you which tests have failed in the Python 3.9 build? On my end they seem to be passing.

ChristopherDavisUCI avatar Oct 08 '22 18:10 ChristopherDavisUCI

Tests on Github Actions are failing because this PR in the altair_viewer repository is not yet merged: https://github.com/altair-viz/altair_viewer/pull/51. I test your latests changes and all seems fine to me. Nice! I noticed there is still one assert in your latest commits, you probably didn't want to include this: https://github.com/altair-viz/altair/pull/2684/commits/b79460deca9f7bb051d09956d1e8646f3c578c2e#diff-bbe4e187b18d242a366c820d023afd89041759cb96e4ec66c3f34559a72c2f9dR2332

mattijn avatar Oct 08 '22 21:10 mattijn

Thanks as always for the feedback @mattijn. I think your four scenarios from https://github.com/altair-viz/altair/pull/2671#issuecomment-1244085790 are currently working. Could you please check and see if you agree? Anything overall that seems not to be working?

ChristopherDavisUCI avatar Oct 09 '22 15:10 ChristopherDavisUCI

Nice as always @ChristopherDavisUCI! For the following layered facet chart, I still have two different errors. The first specification is a layered facet chart without parameters defined:

import altair as alt
from vega_datasets import data

iris = data.iris.url
hover = alt.selection_point(on='mouseover', nearest=True)

base = alt.Chart().encode(
    x='petalLength:Q',
    y='petalWidth:Q',
    color='species:N'  # alt.condition(hover, 'species:N', alt.value('lightgray'))
).properties(
    width=180,
    height=180,
).add_params(hover)

points = base.mark_point()

text = base.mark_text(dy=-5).encode(
    text = 'species:N',
    opacity = alt.value(1)  # alt.condition(hover, alt.value(1), alt.value(0), empty=False)
)

chart = alt.layer(points, text, data=iris).facet(
    'species:N',
)  # .add_params(hover)

chart  # .to_dict()

Returns:

SchemaValidationError: Invalid specification
        altair.vegalite.v5.api.Chart, validating 'required'
        'data' is a required property        
alt.FacetChart(...)

Not sure where I should add my data, I tried within Chart(), within layer(data=iris) and facet(data=iris), but always the same error.

And the second layered facet chart including defined parameters:

import altair as alt
from vega_datasets import data

iris = data.iris.url
hover = alt.selection_point(on='mouseover', nearest=True)

base = alt.Chart().encode(
    x='petalLength:Q',
    y='petalWidth:Q',
    color=alt.condition(hover, 'species:N', alt.value('lightgray'))  # 'species:N'
).properties(
    width=180,
    height=180,
).add_params(hover)

points = base.mark_point()

text = base.mark_text(dy=-5).encode(
    text='species:N',
    opacity=alt.condition(hover, alt.value(1), alt.value(0), empty=False)  # alt.value(1)
)

chart = alt.layer(points, text).facet(
    'species:N',
    data=iris
).add_params(hover)

chart  # .to_dict()`

Gives me:

SchemaValidationError: Invalid specification
        altair.vegalite.v5.schema.channels.Color, validating 'additionalProperties'
        Additional properties are not allowed ('param' was unexpected)        
alt.FacetChart(...)

mattijn avatar Oct 09 '22 17:10 mattijn

Thanks @mattijn! These "data missing" error messages usually mean the parameters are in the wrong place, and that seems to be the case here. I'll take a look.

ChristopherDavisUCI avatar Oct 09 '22 17:10 ChristopherDavisUCI

Hi @mattijn, luckily it seems like the same fix that worked for LayerChart also works in this case. Do your examples now seem to be working? Please keep sending any non-working examples you can come up with!

ChristopherDavisUCI avatar Oct 09 '22 18:10 ChristopherDavisUCI

Does this work for you?:

import altair as alt
from vega_datasets import data

iris = data.iris.url
hover = alt.selection_point(on='mouseover', nearest=True)

base = alt.Chart().encode(
    x='petalLength:Q',
    y='petalWidth:Q',
    color=alt.condition(hover, 'species:N', alt.value('lightgray'))
).properties(
    width=180,
    height=180,
).add_params(hover)

points = base.mark_point()

text = base.mark_text(dy=-5).encode(
    text='species:N',
    opacity=alt.condition(hover, alt.value(1), alt.value(0), empty=False)
)

chart = alt.layer(points, text, data=iris).facet(
    'species:N'
)

chart

For me not..

mattijn avatar Oct 09 '22 18:10 mattijn

For me that example is working @mattijn. (Or at least it displays without an error.)

When you run print(chart.to_json()), does the params property show up at the top level (that's where we want it)? Maybe the GitHub repo hadn't fully updated when you tested it?

ChristopherDavisUCI avatar Oct 09 '22 18:10 ChristopherDavisUCI

Ah, yes! Latest commit didn't come properly through. Yes, this works now as well!

mattijn avatar Oct 09 '22 18:10 mattijn

Great! Please keep looking for examples that don't work as expected.

ChristopherDavisUCI avatar Oct 09 '22 18:10 ChristopherDavisUCI

This is not working, it resolves as option 3 from https://github.com/vega/vega-lite/issues/6890#issuecomment-1266821631. Probably outside the scope of this PR?

import pandas as pd
import altair as alt

source_url = "https://unpkg.com/[email protected]/data/weather.csv"

highlight = alt.selection_point(
    nearest=True, on="mouseover", encodings=["x"], clear="mouseout"
)

chart = alt.Chart(source_url).mark_circle().encode(
    x=alt.X("month(date):T"),
    color=alt.Color("location:N"),
    y=alt.Y(alt.repeat(), type="quantitative", aggregate="mean"),
    size=alt.condition(highlight, alt.value(200), alt.value(50), empty=False),
).repeat(["temp_max", "precipitation", "wind"]).add_params(highlight)

chart

mattijn avatar Oct 09 '22 18:10 mattijn

Hmm, do you see what's wrong with the json that results from your example? At first glance I would have expected this json to be correct. Open the Chart in the Vega Editor

ChristopherDavisUCI avatar Oct 09 '22 18:10 ChristopherDavisUCI

Nope, reported here: https://github.com/vega/vega-lite/issues/8452. Including a grouped issue regarding all the issues we have for the repeat operator, https://github.com/vega/vega-lite/issues/8453.

mattijn avatar Oct 09 '22 19:10 mattijn

I discovered two more issues that are not working:

First, .interactive() does not work in combination with h/v-concat:

!python -m pip install git+https://github.com/ChristopherDavisUCI/altair.git@liftparams
import altair as alt

source = 'https://unpkg.com/[email protected]/data/weather.csv'

# interval = alt.selection_interval(encodings=['x'], bind='scales')
line = (
    alt.Chart(height=100)
    .mark_bar(point=True)
    .encode(
        x=alt.X("date:T"),
        y=alt.Y("precipitation:Q"),
        color=alt.Color("location:N"),
    )
)
alt.vconcat(line, line, data=source).interactive(bind_y=False)  # .add_params(interval)
AttributeError: 'VConcatChart' object has no attribute 'interactive'

Second, I get a duplicate signal name error with the following:


highlight = alt.selection_point(
    nearest=True, on="mouseover", encodings=["x"], clear="mouseout"
)
interval = alt.selection_interval(encodings=['x'], bind='scales')

line = (
    alt.Chart(height=100)
    .mark_line(point=True)
    .encode(
        x=alt.X("month(date):T"),
        y=alt.Y("mean(precipitation):Q"),
        color=alt.Color("location:N"),
    )
)  # .interactive(bind_y=False)  # add_params(interval)

rule = (
    alt.Chart()
    .mark_rule()
    .encode(
        x=alt.X("month(date):T"),
        color=alt.value("black"),
        opacity=alt.condition(highlight, alt.value(1), alt.value(0), empty=False),
    )
)

tick = (
    alt.Chart()
    .mark_tick(point=True)
    .encode(
        x=alt.X("month(date):T"),
        y=alt.Y("mean(precipitation):Q"),
        color=alt.Color("location:N"),
    )
)

alt.vconcat(line, tick + rule, data=source).add_params(
    highlight
)
Javascript Error: Duplicate signal name: "param_45_tuple"
This usually means there's a typo in your chart specification. See the javascript console for the full traceback.

mattijn avatar Oct 12 '22 17:10 mattijn

Thanks @mattijn I will check those out! I always appreciate these bug examples.

By the way, it's slightly easier for me when you use the "url" data syntax, because then the json that's created is much more manageable. So if it's ever easy to make that switch, please do!

ChristopherDavisUCI avatar Oct 12 '22 18:10 ChristopherDavisUCI

Will do!

mattijn avatar Oct 12 '22 18:10 mattijn

Thanks for changing the data! I will try to understand the syntax so I can do that myself in the future.

Are we sure vconcat and hconcat should have an interactive method? I looked back to v3 and I see interactive defined for Chart, LayerChart, FacetChart, and RepeatChart but not VConcatChart.

ChristopherDavisUCI avatar Oct 12 '22 19:10 ChristopherDavisUCI

For me it make sense and the VL-scheme is OK with it. If defined on the concat level, the axis interaction will be done on both charts:

Open the Chart in the Vega Editor

This is different compare to when defining on each sub-chart (which works already):

Open the Chart in the Vega Editor

Even with unequal axis name and type, this is OK on the concat: Open the Chart in the Vega Editor

BTW, it's just a shortcut to the parameter variant (which works already):

interval = alt.selection_interval(encodings=['x'], bind='scales')
line = (
    alt.Chart(height=100)
    .mark_bar(point=True)
    .encode(
        x=alt.X("date:T"),
        y=alt.Y("precipitation:Q"),
        color=alt.Color("location:N"),
    )
)
alt.vconcat(line, line, data=source).add_params(interval)  # .interactive(bind_y=False)

For the second example regarding the duplicate signal, I think we had this issue before. If I change .mark_line(point=True) to .mark_line(point=False) it works. See https://github.com/vega/vega-lite/issues/7854, but that Altair spec it refers to does work now already.. (https://github.com/altair-viz/altair/pull/2528#issuecomment-992989438)

mattijn avatar Oct 12 '22 22:10 mattijn