altair icon indicating copy to clipboard operation
altair copied to clipboard

ENH: Make the schema validation error for non-existing params more informative

Open joelostblom opened this issue 3 years ago • 1 comments

@jakevdp What do you think of something like this as a solution to #2565?

I am still working on organizing the printed parameters in neatly formatted columns so that they are easier to read, but I thought I would check in with you whether you think this solution is robust enough. It has been working fine in my limited testing and I kept the old error message as a fallback to be safe.

close #2565

joelostblom avatar Mar 22 '22 10:03 joelostblom

I rebased on master so that the checks are passing (the many force pushes are for black and flake8, sorry about the noise). I also fixed the formatting of the parameters in the error message. It required more code than expected to make an even table with names sorted alphabetically in columns, but I think that the output is much easier to read now.

Instead of the unformated list of parameters:

SchemaValidationError: Invalid specification

altair.vegalite.v4.schema.core.Axis, validating 'additionalProperties'

altair.Axis has no parameter named 'here'

Existing parameter names are:
'aria', 'bandPosition', 'description', 'domain', 'domainCap', 'domainColor', 'domainDash',
'domainDashOffset', 'domainOpacity', 'domainWidth', 'format',
'formatType', 'grid', 'gridCap', 'gridColor', 'gridDash',
'gridDashOffset', 'gridOpacity', 'gridWidth', 'labelAlign', 'labelAngle',
'labelBaseline', 'labelBound', 'labelColor', 'labelExpr', 'labelFlush',
'labelFlushOffset', 'labelFont', 'labelFontSize', 'labelFontStyle',
'labelFontWeight', 'labelLimit', 'labelLineHeight', 'labelOffset',
'labelOpacity', 'labelOverlap', 'labelPadding', 'labelSeparation',
'labels', 'maxExtent', 'minExtent', 'offset', 'orient', 'position',
'style', 'tickBand', 'tickCap', 'tickColor', 'tickCount', 'tickDash',
'tickDashOffset', 'tickExtra', 'tickMinStep', 'tickOffset', 'tickOpacity',
'tickRound', 'tickSize', 'tickWidth', 'ticks', 'title', 'titleAlign',
'titleAnchor', 'titleAngle', 'titleBaseline', 'titleColor', 'titleFont',
'titleFontSize', 'titleFontStyle', 'titleFontWeight', 'titleLimit',
'titleLineHeight', 'titleOpacity', 'titlePadding', 'titleX', 'titleY',
'translate', 'values', 'zindex'

See the help for altair.Axis to read the full description of these parameters

The output is now a table of parameters:

altair.vegalite.v4.schema.core.Axis, validating 'additionalProperties'

altair.Axis has no parameter named 'here'

Existing parameter names are:
aria               gridDashOffset     labelLineHeight   tickCount        titleBaseline     
bandPosition       gridOpacity        labelOffset       tickDash         titleColor        
description        gridWidth          labelOpacity      tickDashOffset   titleFont         
domain             labelAlign         labelOverlap      tickExtra        titleFontSize     
domainCap          labelAngle         labelPadding      tickMinStep      titleFontStyle    
domainColor        labelBaseline      labelSeparation   tickOffset       titleFontWeight   
domainDash         labelBound         labels            tickOpacity      titleLimit        
domainDashOffset   labelColor         maxExtent         tickRound        titleLineHeight   
domainOpacity      labelExpr          minExtent         tickSize         titleOpacity      
domainWidth        labelFlush         offset            tickWidth        titlePadding      
format             labelFlushOffset   orient            ticks            titleX            
formatType         labelFont          position          title            titleY            
grid               labelFontSize      style             titleAlign       translate         
gridCap            labelFontStyle     tickBand          titleAnchor      values            
gridColor          labelFontWeight    tickCap           titleAngle       zindex            
gridDash           labelLimit         tickColor                                            

See the help for altair.Axis to read the full description of these parameters

I did some basic manual testing with all the encoding channels listed in https://altair-viz.github.io/user_guide/encoding.html#encoding-channels and it seems to work fine. Although I am not sure if there are edge cases, I hope that the fallback to the old method could guard again those, but let me know if you think of anything specific and I can try to add tests for that.

joelostblom avatar Mar 22 '22 20:03 joelostblom

Hi @joelostblom can you synchronise your branch, https://github.com/joelostblom/altair/tree/schema-validation-error, with the main repo and maybe add a code-snippet how to this can be tested easily using eg. colab. This will help the review process. Hopefully @jakevdp is still willing to do this.

mattijn avatar Dec 27 '22 21:12 mattijn

@mattijn I rebased this on the latest main branch and also structured the code so that it is easier to review. Do you think you would be able to have a look at this? I think it could be quite useful to get feedback on in the RC. I have included the rationale and an example below:

When a non-existing parameter name is used, I think it would be helpful to include the existing parameter names in the error message. Currently, when misspelling a parameter name like in the example below, it is not clear whether I made a typo or the parameter doesn't exist at all, so the immediate feedback of seeing the correct parameter names would be helpful.

import altair as alt
import pandas as pd

source = pd.DataFrame({
    'a': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
})

alt.Chart(source).mark_bar().encode(
    x=alt.X("a", scale=alt.Scale(padingOuter=0.5)),
    y=alt.Y("b")
)
SchemaValidationError: Invalid specification

        altair.vegalite.v4.schema.core.Scale, validating 'additionalProperties'

        Additional properties are not allowed ('padingOuter' was unexpected)

I also belive the error message text could be clarified a little. Especially for novices, I don't think it is immediately clear what it means that "Additional properties are not allowed". This PR updates the error message to instead read:

SchemaValidationError: Invalid specification

altair.vegalite.v5.schema.core.Scale, validating 'additionalProperties'

altair.Scale has no parameter named 'padingOuter'

Existing parameter names are:
align      domain      interpolate    range      round    
base       domainMax   nice           rangeMax   scheme   
bins       domainMid   padding        rangeMin   type     
clamp      domainMin   paddingInner   reverse    zero     
constant   exponent    paddingOuter                       

See the help for altair.Scale to read the full description of these parameters

It is possible that we could even remove the two first lines and start with altair.Scale has no... as I don't think the two first lines add any info that helps the user solve the error, what do you think?

This PR also includes a fallback to the old method of displaying error messages, so that in case the assumption in the if-statement does not hold true, things will not break and the user will just see a less helpful error message. Here is an example of when that is happening:

import altair as alt
from vega_datasets import data

source = data.cars()

alt.Chart(source).mark_boxplot(
    ticks={'thickness': 2},
    median={'stroke': 'black', 'strokeWidth': 2},
    size=50
).encode(
    x=alt.X('Miles_per_Gallon:Q', scale=alt.Scale(zero=False)),
    y=alt.Y('Origin', scale=alt.Scale(zero=False)),
    color=alt.Color('Origin'),
)

joelostblom avatar Jan 06 '23 12:01 joelostblom

This looks super helpful! Haven't reviewed the code but just a quick reminder that the changes need to be moved to tools/schemapi/schemapi.py.

binste avatar Jan 07 '23 16:01 binste

Thanks @binste, updated!

joelostblom avatar Jan 08 '23 14:01 joelostblom

Just a note that it might make sense to merge this PR after #2842 and test it again. I'd expect that your changes work in even more cases as the correct additionalProperties error is raised more often, but just to be sure.

binste avatar Jan 20 '23 21:01 binste

Good catch, thank you! I have updated to added the changes to the tools script as well

joelostblom avatar Feb 20 '23 23:02 joelostblom

I think it is a really nice PR. Thanks @joelostblom. Really helpful, nice that it gives suggestions. It makes me want to make mistakes on purpose to read the suggestions..

I found one spec that gives an incorrect hint:

import altair as alt
from vega_datasets import data
cars = data.cars.url

alt.Chart(cars).mark_point().encode(
    x='Acceleration:Q',
    y='Horsepower:Q',
    color=alt.value(1)  # should be eg. alt.value('red')
)
SchemaValidationError: `Color` has no parameter named 'value'

Existing parameter names are:
shorthand      bin         legend   timeUnit   
aggregate      condition   scale    title      
bandPosition   field       sort     type       

See the help for `Color` to read the full description of these parameters

It is allowed to use alt.value() in this context, but in this case it should be a string and not a number.

Also the context description is based on the jsonschema. So I'm still linking this comment https://github.com/altair-viz/altair/pull/2842#issuecomment-1400938604 here as well.

alt.Chart(data.barley()).mark_bar().encode(
    x=alt.X('variety'),
    y=alt.Y('sum(yield)', stack='null'),  # should be eg. stack=None
)
SchemaValidationError: 'null' is an invalid value for `stack`:

'null' is not one of ['zero', 'center', 'normalize']
'null' is not of type 'null'
'null' is not of type 'boolean'

I tried to follow the suggestion, but it was not directly clear that type 'null' actually is None.

No further comments. Congrats again on this PR!

mattijn avatar Feb 21 '23 22:02 mattijn

Thank you @mattijn :smile: I will go ahead and merge this since you approved and the change that @binste suggested has also been implemented.

We can follow up on the remaining two issues separately. I noticed what you mentioned regarding alt.datum/value and it was introduced before this PR so I opened https://github.com/altair-viz/altair/issues/2913 to track it. I agree that it would be helpful to show the Python types as well and opened https://github.com/altair-viz/altair/issues/2914 for that.

joelostblom avatar Feb 21 '23 23:02 joelostblom