altair
altair copied to clipboard
ENH: Make the schema validation error for non-existing params more informative
@jakevdp What do you think of something like this as a solution to #2565?
I am still working on organizing the printed parameters in neatly formatted columns so that they are easier to read, but I thought I would check in with you whether you think this solution is robust enough. It has been working fine in my limited testing and I kept the old error message as a fallback to be safe.
close #2565
I rebased on master so that the checks are passing (the many force pushes are for black and flake8, sorry about the noise). I also fixed the formatting of the parameters in the error message. It required more code than expected to make an even table with names sorted alphabetically in columns, but I think that the output is much easier to read now.
Instead of the unformated list of parameters:
SchemaValidationError: Invalid specification
altair.vegalite.v4.schema.core.Axis, validating 'additionalProperties'
altair.Axis has no parameter named 'here'
Existing parameter names are:
'aria', 'bandPosition', 'description', 'domain', 'domainCap', 'domainColor', 'domainDash',
'domainDashOffset', 'domainOpacity', 'domainWidth', 'format',
'formatType', 'grid', 'gridCap', 'gridColor', 'gridDash',
'gridDashOffset', 'gridOpacity', 'gridWidth', 'labelAlign', 'labelAngle',
'labelBaseline', 'labelBound', 'labelColor', 'labelExpr', 'labelFlush',
'labelFlushOffset', 'labelFont', 'labelFontSize', 'labelFontStyle',
'labelFontWeight', 'labelLimit', 'labelLineHeight', 'labelOffset',
'labelOpacity', 'labelOverlap', 'labelPadding', 'labelSeparation',
'labels', 'maxExtent', 'minExtent', 'offset', 'orient', 'position',
'style', 'tickBand', 'tickCap', 'tickColor', 'tickCount', 'tickDash',
'tickDashOffset', 'tickExtra', 'tickMinStep', 'tickOffset', 'tickOpacity',
'tickRound', 'tickSize', 'tickWidth', 'ticks', 'title', 'titleAlign',
'titleAnchor', 'titleAngle', 'titleBaseline', 'titleColor', 'titleFont',
'titleFontSize', 'titleFontStyle', 'titleFontWeight', 'titleLimit',
'titleLineHeight', 'titleOpacity', 'titlePadding', 'titleX', 'titleY',
'translate', 'values', 'zindex'
See the help for altair.Axis to read the full description of these parameters
The output is now a table of parameters:
altair.vegalite.v4.schema.core.Axis, validating 'additionalProperties'
altair.Axis has no parameter named 'here'
Existing parameter names are:
aria gridDashOffset labelLineHeight tickCount titleBaseline
bandPosition gridOpacity labelOffset tickDash titleColor
description gridWidth labelOpacity tickDashOffset titleFont
domain labelAlign labelOverlap tickExtra titleFontSize
domainCap labelAngle labelPadding tickMinStep titleFontStyle
domainColor labelBaseline labelSeparation tickOffset titleFontWeight
domainDash labelBound labels tickOpacity titleLimit
domainDashOffset labelColor maxExtent tickRound titleLineHeight
domainOpacity labelExpr minExtent tickSize titleOpacity
domainWidth labelFlush offset tickWidth titlePadding
format labelFlushOffset orient ticks titleX
formatType labelFont position title titleY
grid labelFontSize style titleAlign translate
gridCap labelFontStyle tickBand titleAnchor values
gridColor labelFontWeight tickCap titleAngle zindex
gridDash labelLimit tickColor
See the help for altair.Axis to read the full description of these parameters
I did some basic manual testing with all the encoding channels listed in https://altair-viz.github.io/user_guide/encoding.html#encoding-channels and it seems to work fine. Although I am not sure if there are edge cases, I hope that the fallback to the old method could guard again those, but let me know if you think of anything specific and I can try to add tests for that.
Hi @joelostblom can you synchronise your branch, https://github.com/joelostblom/altair/tree/schema-validation-error, with the main repo and maybe add a code-snippet how to this can be tested easily using eg. colab. This will help the review process. Hopefully @jakevdp is still willing to do this.
@mattijn I rebased this on the latest main branch and also structured the code so that it is easier to review. Do you think you would be able to have a look at this? I think it could be quite useful to get feedback on in the RC. I have included the rationale and an example below:
When a non-existing parameter name is used, I think it would be helpful to include the existing parameter names in the error message. Currently, when misspelling a parameter name like in the example below, it is not clear whether I made a typo or the parameter doesn't exist at all, so the immediate feedback of seeing the correct parameter names would be helpful.
import altair as alt
import pandas as pd
source = pd.DataFrame({
'a': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
})
alt.Chart(source).mark_bar().encode(
x=alt.X("a", scale=alt.Scale(padingOuter=0.5)),
y=alt.Y("b")
)
SchemaValidationError: Invalid specification
altair.vegalite.v4.schema.core.Scale, validating 'additionalProperties'
Additional properties are not allowed ('padingOuter' was unexpected)
I also belive the error message text could be clarified a little. Especially for novices, I don't think it is immediately clear what it means that "Additional properties are not allowed". This PR updates the error message to instead read:
SchemaValidationError: Invalid specification
altair.vegalite.v5.schema.core.Scale, validating 'additionalProperties'
altair.Scale has no parameter named 'padingOuter'
Existing parameter names are:
align domain interpolate range round
base domainMax nice rangeMax scheme
bins domainMid padding rangeMin type
clamp domainMin paddingInner reverse zero
constant exponent paddingOuter
See the help for altair.Scale to read the full description of these parameters
It is possible that we could even remove the two first lines and start with altair.Scale has no... as I don't think the two first lines add any info that helps the user solve the error, what do you think?
This PR also includes a fallback to the old method of displaying error messages, so that in case the assumption in the if-statement does not hold true, things will not break and the user will just see a less helpful error message. Here is an example of when that is happening:
import altair as alt
from vega_datasets import data
source = data.cars()
alt.Chart(source).mark_boxplot(
ticks={'thickness': 2},
median={'stroke': 'black', 'strokeWidth': 2},
size=50
).encode(
x=alt.X('Miles_per_Gallon:Q', scale=alt.Scale(zero=False)),
y=alt.Y('Origin', scale=alt.Scale(zero=False)),
color=alt.Color('Origin'),
)
This looks super helpful! Haven't reviewed the code but just a quick reminder that the changes need to be moved to tools/schemapi/schemapi.py.
Thanks @binste, updated!
Just a note that it might make sense to merge this PR after #2842 and test it again. I'd expect that your changes work in even more cases as the correct additionalProperties error is raised more often, but just to be sure.
Good catch, thank you! I have updated to added the changes to the tools script as well
I think it is a really nice PR. Thanks @joelostblom. Really helpful, nice that it gives suggestions. It makes me want to make mistakes on purpose to read the suggestions..
I found one spec that gives an incorrect hint:
import altair as alt
from vega_datasets import data
cars = data.cars.url
alt.Chart(cars).mark_point().encode(
x='Acceleration:Q',
y='Horsepower:Q',
color=alt.value(1) # should be eg. alt.value('red')
)
SchemaValidationError: `Color` has no parameter named 'value'
Existing parameter names are:
shorthand bin legend timeUnit
aggregate condition scale title
bandPosition field sort type
See the help for `Color` to read the full description of these parameters
It is allowed to use alt.value() in this context, but in this case it should be a string and not a number.
Also the context description is based on the jsonschema. So I'm still linking this comment https://github.com/altair-viz/altair/pull/2842#issuecomment-1400938604 here as well.
alt.Chart(data.barley()).mark_bar().encode(
x=alt.X('variety'),
y=alt.Y('sum(yield)', stack='null'), # should be eg. stack=None
)
SchemaValidationError: 'null' is an invalid value for `stack`:
'null' is not one of ['zero', 'center', 'normalize']
'null' is not of type 'null'
'null' is not of type 'boolean'
I tried to follow the suggestion, but it was not directly clear that type 'null' actually is None.
No further comments. Congrats again on this PR!
Thank you @mattijn :smile: I will go ahead and merge this since you approved and the change that @binste suggested has also been implemented.
We can follow up on the remaining two issues separately. I noticed what you mentioned regarding alt.datum/value and it was introduced before this PR so I opened https://github.com/altair-viz/altair/issues/2913 to track it. I agree that it would be helpful to show the Python types as well and opened https://github.com/altair-viz/altair/issues/2914 for that.