serialization of expr in parameter
As mentioned here: https://github.com/altair-viz/altair/pull/2528#issuecomment-1077773120, I have the feeling that when an expr is defined within a parameter, it is serialized into a debatable VL spec.
The current behavior also means that add_parameter is not needed when an expr is defined, as mentioned here: https://github.com/altair-viz/altair/pull/2528#issuecomment-1078171474.
BTW, in VL one can write "thetaOffset": {"expr": "-PI/length(data('MY_DATA'))"}, but in Altair one cannot write alt.expr(-PI/length(data('source_wind'))).
Hi @mattijn, in the lines you identified in the other PR, if we add an elif condition so they become
if self.param.expr is Undefined:
    return {"expr": self.name}
elif isinstance(self.param.expr, str):
    return {"expr": self.param.expr}
else:
    return {"expr": repr(self.param.expr)}
that seems to fix the issue. If I understand correctly, the main issue was repr turning ' into \', but I haven't looked super carefully. I ran the tests and didn't get any errors, but I should look around and see if it broke any of the existing parameter functionality.
This change produces
"thetaOffset": {
  "expr": "-PI/length(data('source_wind'))"
}
in the serialization. Are you confident that the following would be preferred?
"thetaOffset": {
  "expr": "parameter027"
}
I think my question about which "expr" we want is very related to your "we don't need to add the parameter" comment. Of course, if we use
"thetaOffset": {
  "expr": "parameter027"
}
then add_parameter will be required.
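To make the effect of the extra branch concrete, here is a minimal stand-alone sketch of the same three-way check. It does not import Altair: Undefined, FakeParam, and to_expr_dict are hypothetical stand-ins invented for this illustration. The only point is that repr() applied to a str adds quote characters to the value, whereas the isinstance branch passes the string through unchanged.

```python
import json

# Hypothetical stand-in for Altair's Undefined sentinel (assumption).
Undefined = object()


class FakeParam:
    """Hypothetical stand-in for a parameter with a name and an optional expr."""

    def __init__(self, name, expr=Undefined):
        self.name = name
        self.expr = expr


def to_expr_dict(param, special_case_str=True):
    """Mirror the three-branch check discussed above."""
    if param.expr is Undefined:
        return {"expr": param.name}
    elif special_case_str and isinstance(param.expr, str):
        return {"expr": param.expr}
    else:
        return {"expr": repr(param.expr)}


p = FakeParam("parameter027", expr="-PI/length(data('source_wind'))")
with_branch = to_expr_dict(p)                              # string passed through
without_branch = to_expr_dict(p, special_case_str=False)   # string goes through repr()
no_expr = to_expr_dict(FakeParam("parameter027"))          # falls back to the name

print(json.dumps(with_branch))
print(json.dumps(without_branch))
print(json.dumps(no_expr))
```

Comparing the first two printed lines shows the extra quoting that the elif branch avoids; the third shows the name-based form that requires the parameter to be added to the chart.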
I think I prefer:
theta_shift_as_par = alt.parameter(expr=(-PI/length(data('source_wind'))))
theta_shift_as_expr = alt.expr(-PI/length(data('source_wind')))  # not yet possible
chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=theta_shift_as_par  # serialize to parameter name
    # thetaOffset=theta_shift_as_expr  # alternative: serialize directly to expr
).encode(
    theta='winddirection:N',
    color='label:N'
).add_parameter(theta_shift_as_par)
chart#.to_dict()
The repr issue will be solved when entering the expression as a string, but I have since found out that I should import the relevant expressions with from altair.expr import X, Y, Z and use these to build up the expressions.
Could you post an example of how you make the above parameter using from altair.expr import X, Y, Z? My memory is a little fuzzy, but I remember trying to circumvent altair.expr as much as possible.
import altair as alt
from altair.expr import PI, length, data

data_wind = [
    {'winddirection': 0, 'label': 'North'},
    {'winddirection': 90, 'label': 'East'},
    {'winddirection': 180, 'label': 'South'},
    {'winddirection': 270, 'label': 'West'},
]
source_wind = alt.DataSource(alt.InlineData(values=data_wind, name='source_wind'))
theta_shift_as_par = alt.parameter(expr=-PI/length(data('source_wind')))
#theta_shift_as_expr = alt.expr(-PI/length(data('source_wind')))
chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=theta_shift_as_par  # should serialize to parameter name?
    #thetaOffset=theta_shift_as_expr  # should serialize directly to expr?
).encode(
    theta='winddirection:N',
    color='label:N'
)#.add_parameter(theta_shift_as_par)  # when wanted as parameter name
chart.to_dict()
{'$schema': 'https://vega.github.io/schema/vega-lite/v5.2.0.json', 'config': {'view': {'continuousHeight': 300, 'continuousWidth': 400}}, 'data': {'name': 'source_wind', 'values': [{'label': 'North', 'winddirection': 0}, {'label': 'East', 'winddirection': 90}, {'label': 'South', 'winddirection': 180}, {'label': 'West', 'winddirection': 270}]}, 'encoding': {'color': {'field': 'label', 'type': 'nominal'}, 'theta': {'field': 'winddirection', 'type': 'nominal'}}, 'mark': {'thetaOffset': {'expr': "((-PI) / length(data('source_wind')))"}, 'tooltip': True, 'type': 'arc'}}
https://colab.research.google.com/drive/1-jFoG2BOYVUnMzAhKhaLQLTU6dFpeDWI?usp=sharing
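The parenthesized string in the output above, ((-PI) / length(data('source_wind'))), is what you get when the expression is built from Python objects rather than typed as a string. As a rough illustration of that idea, here is a toy model based on operator overloading. The classes Node, Name, Unary, Binary, and Call are made up for this sketch and are not Altair's actual implementation.

```python
class Node:
    """Toy expression node: operators build larger nodes instead of computing."""

    def __neg__(self):
        return Unary("-", self)

    def __truediv__(self, other):
        return Binary("/", self, other)

    def __add__(self, other):
        return Binary("+", self, other)

    def __radd__(self, other):
        return Binary("+", other, self)


class Name(Node):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


class Unary(Node):
    def __init__(self, op, operand):
        self.op, self.operand = op, operand

    def __repr__(self):
        return f"({self.op}{self.operand!r})"


class Binary(Node):
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs

    def __repr__(self):
        return f"({self.lhs!r} {self.op} {self.rhs!r})"


class Call(Node):
    def __init__(self, func, *args):
        self.func, self.args = func, args

    def __repr__(self):
        return f"{self.func}({', '.join(repr(a) for a in self.args)})"


PI = Name("PI")


def data(name):
    return Call("data", name)


def length(arg):
    return Call("length", arg)


shift = -PI / length(data("source_wind"))
print(repr(shift))
print(repr(5 + shift))
```

Because every operator returns a new node, an object like 5 + shift is just a bigger expression tree, which is why nothing distinguishes it from the original expression at serialization time.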
@jakevdp do you agree with @mattijn's suggestion that theta_shift_as_par (as defined below) should serialize using the parameter name, as in "expr": "parameter027", and that theta_shift_as_expr should serialize so that the formula gets spelled out, as in "expr": "-PI/length(data('source_wind'))"? If that sounds good to you, I will see how much of that I can implement and will report where I get stuck.
import altair as alt
from altair.expr import PI, length, data

theta_shift_as_par = alt.parameter(expr=(-PI/length(data('source_wind'))))
theta_shift_as_expr = alt.expr(-PI/length(data('source_wind')))  # not yet possible
chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=theta_shift_as_par  # serialize to parameter name
    # thetaOffset=theta_shift_as_expr  # alternative: serialize directly to expr
).encode(
    theta='winddirection:N',
    color='label:N'
).add_parameter(theta_shift_as_par)
chart#.to_dict()
Maybe @domoritz can shed some light on this? What is the proper approach for Altair to adopt? Will expr continue to co-exist next to param in the VL grammar, and should it be treated separately?
I'm not following this whole conversation, but here is my take (let me know if you want me to elaborate on anything). We plan to support and keep supporting expr in many places, like
{
  "params": [
    { "name": "cornerRadius", "value": 0,
      "bind": {"input": "range", "min": 0, "max": 50, "step": 1} }
  ],
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
      {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
      {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
    ]
  },
  "mark": {
    "type": "bar",
    "cornerRadius": {"expr": "cornerRadius"}
  },
  "encoding": {
    "x": {"field": "a", "type": "nominal", "axis": {"labelAngle": 0}},
    "y": {"field": "b", "type": "quantitative"}
  }
}
expr supports expressions, and they can refer to params, but usually you would use a param that is a simple value. A param may be a composed object in the implementation (e.g. selection params), so we do support param in some places where we can automatically resolve how it should behave (e.g. filter a dataset by the interval defined in a selection param).
{
  "data": {"url": "data/cars.json"},
  "vconcat": [{
    "params": [{"name": "brush", "select": "interval"}],
    "mark": "point",
    "encoding": {
      "x": {"field": "Horsepower", "type": "quantitative"},
      "y": {"field": "Miles_per_Gallon", "type": "quantitative"}
    }
  }, {
    "transform": [
      {"filter": {"param": "brush"}}
    ],
    "mark": "point",
    "encoding": {
      "x": {
        "field": "Acceleration", "type": "quantitative",
        "scale": {"domain": [0,25]}
      },
      "y": {
        "field": "Displacement", "type": "quantitative",
        "scale": {"domain": [0, 500]}
      }
    }
  }]
}
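The two reference styles shown in these specs, {"expr": ...} and {"param": ...}, can be told apart mechanically. As a small sketch of that distinction, the hypothetical helper walk below scans a spec dict and records which params are declared and which references of each kind appear. The spec is a trimmed-down combination of the two examples above, kept only for illustration and not meant as a complete Vega-Lite chart.

```python
def walk(node, found):
    """Recursively record declared params plus "expr" and "param" references."""
    if isinstance(node, dict):
        for p in node.get("params", []):
            found["declared"].add(p["name"])
        if isinstance(node.get("expr"), str):
            found["expr"].append(node["expr"])
        if isinstance(node.get("param"), str):
            found["param"].append(node["param"])
        for value in node.values():
            walk(value, found)
    elif isinstance(node, list):
        for item in node:
            walk(item, found)
    return found


# Trimmed-down spec for illustration only (assumption: not a full chart).
spec = {
    "params": [{"name": "cornerRadius", "value": 0}],
    "vconcat": [
        {
            "params": [{"name": "brush", "select": "interval"}],
            "mark": {"type": "bar", "cornerRadius": {"expr": "cornerRadius"}},
        },
        {"transform": [{"filter": {"param": "brush"}}], "mark": "point"},
    ],
}

found = walk(spec, {"declared": set(), "expr": [], "param": []})
print(sorted(found["declared"]))
print(found["expr"])
print(found["param"])
```

Here the expr reference is a free-form expression string that happens to name a param, while the param reference names the selection directly and leaves it to the compiler to decide how to use it.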
I'm a little stuck with how to proceed @mattijn @jakevdp @joelostblom
Say we define a parameter like @mattijn did above:
theta_shift = alt.parameter(expr=(-PI/length(data('source_wind'))))
There is currently no real difference between that object in Altair and an object like 5+theta_shift (just an extra 5+ in the expr).
The good thing about the current setup is that 5+theta_shift can be referred to without explicitly making a new parameter. But @mattijn points out some unintended consequences: for example, theta_shift can itself be referred to in a chart without explicitly adding that parameter to the chart.
Should we make it so that 5+theta_shift is fundamentally different from theta_shift, so that, for example, 5+theta_shift can only be used in a chart if theta_shift has been explicitly added to the chart? It could either be a new object or there could be some Parameter attribute that would differentiate these cases.
I am also tempted by the two-line temporary solution that I mentioned above https://github.com/altair-viz/altair/issues/2573#issuecomment-1079303167 which would get @mattijn's original chart working (and doesn't seem to break anything that currently works).
I'm happy for any suggestions!
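One way to picture the "fundamentally different object" option is a derived expression that remembers which named parameters it refers to, so the chart can refuse it until those parameters have been added. The sketch below is purely hypothetical: Param, Derived, Chart, depends_on, and resolve are names invented here and do not correspond to Altair classes or methods.

```python
class Param:
    """Hypothetical named parameter that must be added to a chart explicitly."""

    def __init__(self, name, expr=None):
        self.name = name
        self.expr = expr
        self.depends_on = {name}

    def __radd__(self, other):
        # 5 + theta_shift yields a derived expression that refers to the name.
        return Derived(f"({other!r} + {self.name})", self.depends_on)


class Derived:
    """Hypothetical inline expression that tracks the params it refers to."""

    def __init__(self, text, depends_on):
        self.text = text
        self.depends_on = set(depends_on)


class Chart:
    """Hypothetical chart that only resolves references to added params."""

    def __init__(self):
        self.params = {}

    def add_parameter(self, param):
        self.params[param.name] = param
        return self

    def resolve(self, value):
        missing = value.depends_on - set(self.params)
        if missing:
            raise ValueError(f"parameters not added to chart: {sorted(missing)}")
        text = value.name if isinstance(value, Param) else value.text
        return {"expr": text}


theta_shift = Param("theta_shift", expr="-PI/length(data('source_wind'))")
shifted = 5 + theta_shift

chart = Chart()
try:
    chart.resolve(shifted)
    error = None
except ValueError as err:
    error = str(err)

chart.add_parameter(theta_shift)
resolved = chart.resolve(shifted)
print(error)
print(resolved)
```

In this model both theta_shift and 5 + theta_shift serialize by name, and both fail until add_parameter has been called, which is the stricter behavior being asked about.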
Thanks @domoritz, it's a bit of a semantic discussion. The parts you mentioned are clear. The issue here is about the exprs. These can be defined:
1. within the scope of a parameter, or
2. outside a parameter.
I think in case 1 the expr should always be referred to by the parameter name (this is not happening at the moment in Altair). For case 2, if exprs are allowed to be defined outside the scope of a parameter, then Altair should introduce something for this.
Hi @mattijn I made an attempt to resolve these issues in #2591. Please take a look and let me know any comments!
I mentioned before that this was not possible:
expr_ref_type = alt.expr(-PI/length(data('source_wind')))
But I just found out that within VL this is called an ExprRef, so this is possible:
expr_ref_type = alt.ExprRef(-PI/length(data('source_wind')))
expr_ref_type
ExprRef({ expr: ((-PI) / length(data('source_wind'))) })
So one can currently use an expression inline, without defining it within a parameter, like this:
import altair as alt
from altair.expr import PI, length, data

data_wind = [
    {'winddirection': 0, 'label': 'North'},
    {'winddirection': 90, 'label': 'East'},
    {'winddirection': 180, 'label': 'South'},
    {'winddirection': 270, 'label': 'West'},
]
source_wind = alt.DataSource(alt.InlineData(values=data_wind, name='source_wind'))
expr_ref_str = alt.ExprRef("-PI/length(data('source_wind'))")
expr_ref_type = alt.ExprRef(-PI/length(data('source_wind')))
chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=expr_ref_type  # <--
).encode(
    theta='winddirection:N',
    color='label:N'
)
chart.to_dict()
{'$schema': 'https://vega.github.io/schema/vega-lite/v5.2.0.json',
 'config': {'view': {'continuousHeight': 300, 'continuousWidth': 400}},
 'data': {'name': 'source_wind',
          'values': [{'label': 'North', 'winddirection': 0},
                     {'label': 'East', 'winddirection': 90},
                     {'label': 'South', 'winddirection': 180},
                     {'label': 'West', 'winddirection': 270}]},
 'encoding': {'color': {'field': 'label', 'type': 'nominal'},
              'theta': {'field': 'winddirection', 'type': 'nominal'}},
 'mark': {'thetaOffset': {'expr': "((-PI) / length(data('source_wind')))"},  # <--
          'tooltip': True,
          'type': 'arc'}}
But alt.ExprRef() does not support operators (one cannot do expr_ref_str + expr_ref_str).
Not sure if we can or should introduce a shortcut to create an ExprRef through alt.expr(). I like it, but alt.expr is a module and I am not sure if we can follow a similar strategy as alt.datum() (as defined here).
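On the question of whether a module can double as a callable: in plain Python this is possible with a types.ModuleType subclass that defines __call__. The sketch below is one possible approach under that assumption, not a description of how Altair does or should do it. CallableModule and InlineExpr are hypothetical names, and InlineExpr also shows how a wrapper could support the + operator that is missing above.

```python
import types


class InlineExpr:
    """Hypothetical inline expression wrapper that supports `+`."""

    def __init__(self, text):
        self.text = str(text)

    def __add__(self, other):
        other_text = other.text if isinstance(other, InlineExpr) else repr(other)
        return InlineExpr(f"({self.text} + {other_text})")

    def to_dict(self):
        return {"expr": self.text}


class CallableModule(types.ModuleType):
    """Module-like namespace whose instances can also be called."""

    def __call__(self, expression):
        return InlineExpr(expression)


# Stand-in for a module named "expr"; attributes keep working as usual.
expr = CallableModule("expr")
expr.PI = "PI"

ref = expr("-PI/length(data('source_wind'))")
doubled = ref + ref
print(ref.to_dict())
print(doubled.to_dict())
print(expr.PI)
```

The same object then serves both roles: expr.PI style attribute access for building expressions and expr(...) as a shortcut for an inline reference.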