Is it possible to compile from vega/vega-lite spec to Altair Code?

Open ObservedObserver opened this issue 2 years ago • 3 comments

Thinking about exporting pygwalker visualizations to Altair code in Python.

ObservedObserver avatar Sep 29 '23 20:09 ObservedObserver

Thanks for asking @ObservedObserver! Unfortunately, I don't think this is possible.

mattijn avatar Sep 30 '23 05:09 mattijn

See also https://github.com/altair-viz/altair/issues/913#issuecomment-393713890

mattijn avatar Sep 30 '23 05:09 mattijn

Taking the example in Altair Internals:

import altair as alt
from vega_datasets import data

chart = alt.Chart(data.cars.url).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color='Origin:N',
).configure_view(
    continuousHeight=300,
    continuousWidth=300,
)

gives the following Vega-Lite spec:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.14.1.json",
  "config": {
    "view": {
      "continuousHeight": 300,
      "continuousWidth": 300
    }
  },
  "data": {
    "url": "https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/cars.json"
  },
  "encoding": {
    "color": {
      "field": "Origin",
      "type": "nominal"
    },
    "x": {
      "field": "Horsepower",
      "type": "quantitative"
    },
    "y": {
      "field": "Miles_per_Gallon",
      "type": "quantitative"
    }
  },
  "mark": {
    "type": "point"
  }
}

I think the first step, if you want to get something like this done in pygwalker, might be to map the VL spec back to the low-level Altair classes. That is probably doable, as the mapping from VL spec to Altair classes is already implemented in Chart.from_dict. As mentioned by Jake, this will result in some pretty unwieldy code. If we read the VL spec above back in and look at the encodings, we get:

alt.Chart.from_dict(spec).encoding
FacetedEncoding({
  color: FieldOrDatumDefWithConditionMarkPropFieldDefGradientstringnull({
    field: FieldName('Origin'),
    type: StandardType('nominal')
  }),
  x: PositionFieldDef({
    field: FieldName('Horsepower'),
    type: StandardType('quantitative')
  }),
  y: PositionFieldDef({
    field: FieldName('Miles_per_Gallon'),
    type: StandardType('quantitative')
  })
})

The main challenge is then to map this to the high-level Altair API that users would expect (see the Altair code at the beginning). Maybe pygwalker could start out by producing the "low-level" Altair code and then implement rules on top to produce more readable Altair code where possible. Whenever no rule is implemented or a spec gets too complex, it could fall back to the low-level Altair classes. This can then be improved over time. Some of these rules, such as simplifying FieldName('Miles_per_Gallon') to just the column name, could be inferred automatically from the Vega-Lite schema.
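To make the "rules with a fallback" idea a bit more concrete, here is a rough sketch of what such a generator could look like. It is not an existing Altair or pygwalker feature: `spec_to_altair_code`, `channel_shorthand`, and the list of supported marks are hypothetical names made up for illustration, and the data URL is a placeholder. It only works on the spec dictionary and emits source code as a string, so it does not need Altair installed. For simplicity, the fallback here is a single `alt.Chart.from_dict(...)` call on the whole spec rather than generated low-level class code.

```python
# Hypothetical sketch: turn a simple Vega-Lite spec (a dict) into Altair source code.
# Anything the rules below do not cover falls back to alt.Chart.from_dict(<spec>).

# Vega-Lite type names and the matching Altair shorthand codes.
TYPE_CODES = {"quantitative": "Q", "nominal": "N", "ordinal": "O", "temporal": "T"}

# Marks for which we emit chart.mark_<name>() (a deliberately small subset).
SIMPLE_MARKS = {"point", "bar", "line", "area", "circle", "rect", "tick", "rule"}

# Top-level spec keys the rules know how to handle.
SUPPORTED_KEYS = {"$schema", "config", "data", "mark", "encoding"}


def channel_shorthand(definition):
    """Return 'Field:Q'-style shorthand, or None if no rule applies."""
    if not isinstance(definition, dict) or set(definition) != {"field", "type"}:
        return None
    field, code = definition["field"], TYPE_CODES.get(definition["type"])
    if code is None or not isinstance(field, str) or ":" in field:
        return None
    return f"{field}:{code}"


def spec_to_altair_code(spec):
    """Generate readable Altair code where possible, else a from_dict fallback."""
    fallback = f"alt.Chart.from_dict({spec!r})"
    if set(spec) - SUPPORTED_KEYS:
        return fallback

    data = spec.get("data", {})
    config = spec.get("config", {})
    mark = spec.get("mark")
    if isinstance(mark, dict):
        # Only a bare {"type": ...} mark definition is simplified.
        mark = mark["type"] if set(mark) == {"type"} else None
    if set(data) != {"url"} or mark not in SIMPLE_MARKS or set(config) - {"view"}:
        return fallback

    lines = [f"alt.Chart({data['url']!r}).mark_{mark}().encode("]
    for name, definition in spec.get("encoding", {}).items():
        shorthand = channel_shorthand(definition)
        if shorthand is None:
            return fallback  # no rule for this channel yet
        lines.append(f"    {name}={shorthand!r},")
    lines.append(")")

    view = config.get("view")
    if view:
        args = ", ".join(f"{key}={value!r}" for key, value in view.items())
        lines[-1] += f".configure_view({args})"
    return "\n".join(lines)


example_spec = {
    "data": {"url": "https://example.com/cars.json"},  # placeholder URL
    "mark": {"type": "point"},
    "encoding": {
        "x": {"field": "Horsepower", "type": "quantitative"},
        "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
    },
}
print(spec_to_altair_code(example_spec))
```

A real implementation would need many more rules (transforms, layered and faceted charts, selections, inline data), and a finer-grained fallback per channel would probably give nicer output than falling back for the whole chart. The point of the sketch is only that the rule set can start small and grow over time.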

I think it's a great idea for tools such as pygwalker to have this code generation capability. I have looked at using pygwalker in the past but didn't go for it, as I was missing exactly this. Come to think of it, it would of course be great if it were a standalone package so that other tools could benefit from it as well :)

I hope this helps! If you give it a try, feel free to ping me if you have any questions about the mapping of Altair code <-> VL schema. I'm happy to have a look if time permits.

binste avatar Oct 01 '23 07:10 binste