altair
Large datasets not working in v5.1.2
Following this example from the docs about dealing with large datasets:
import altair as alt
import pandas as pd
data = pd.DataFrame({"x": range(10000)})
alt.data_transformers.disable_max_rows()
alt.Chart(data).mark_point()
This used to work in old versions of altair, but in 5.1.2 in Jupyter it gives this error:
Javascript Error: Cannot read properties of undefined (reading 'shape') This usually means there's a typo in your chart specification. See the javascript console for the full traceback.
Thanks for reporting this! I tried using your code just now in Safari and in Chrome and didn't have a problem. Have you tried opening a new notebook (maybe after clearing the browser cache) and trying again? Are you using Jupyter Lab or Jupyter Notebook?
Ah strange, yes I tried in different browsers. This is Jupyter Notebook, Python 3.9.13, and the following Jupyter core package versions:
IPython : 7.31.1
ipykernel : 6.15.2
ipywidgets : 7.6.5
jupyter_client : 7.3.4
jupyter_core : 4.11.1
jupyter_server : 1.18.1
jupyterlab : 3.4.4
nbclient : 0.5.13
nbconvert : 6.4.4
nbformat : 5.5.0
notebook : 6.4.12
qtconsole : 5.2.2
traitlets : 5.1.1
It works for me too. One thing to try is to close down all jupyter notebooks with an altair chart and then reopen jupyter lab and try again.
@joelostblom's suggestion has also worked for me in the past, although I would have thought opening in different browsers would provide something similar...
I compared my version numbers and many of them are slightly higher:
ipykernel 6.25.2
ipython 8.15.0
jupyter_client 8.3.1
jupyter_core 5.3.1
jupyterlab 4.0.5
nbclient 0.8.0
nbconvert 7.8.0
nbformat 5.9.2
notebook 7.0.3
traitlets 5.10.0
If it turns out that Altair is not compatible with something in your system, it would be good if we could learn that for future reference!
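For comparing environments like the ones listed in this thread, a small script can gather the installed versions in one go. This is only a sketch: the `collect_versions` helper and the `PACKAGES` list are made up for illustration, and a package may show as "not installed" if its distribution name differs from the name used here.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages mentioned in this thread; adjust the list as needed.
PACKAGES = [
    "altair", "pandas", "ipython", "ipykernel", "ipywidgets",
    "jupyter_client", "jupyter_core", "jupyter_server", "jupyterlab",
    "nbclient", "nbconvert", "nbformat", "notebook", "traitlets",
]


def collect_versions(packages):
    """Return {package name: installed version or 'not installed'}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


report = collect_versions(PACKAGES)
for name, ver in report.items():
    print(f"{name:<16}: {ver}")
```

Pasting the printed report into an issue makes it easier to spot which versions differ between a working and a failing setup.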
I just set up a new conda env on a completely separate machine with the following versions:
IPython : 8.16.1
ipykernel : 6.25.2
ipywidgets : 8.1.1
jupyter_client : 8.3.1
jupyter_core : 5.3.1
jupyter_server : 2.7.3
jupyterlab : 4.0.6
nbclient : 0.8.0
nbconvert : 7.9.2
nbformat : 5.9.2
notebook : 7.0.4
qtconsole : 5.4.4
traitlets : 5.11.2
altair : 5.1.2
pandas : 2.1.1
python : 3.12.0
and still get the same error as before running the above sample code.
Can you open up the javascript console (F12 in most browsers) and see if there is any additional information there?
Gives the same error as in Jupyter with a bit of a trace:
Uncaught (in promise) Javascript Error: Cannot read properties of undefined (reading 'shape')
This usually means there's a typo in your chart specification. See the javascript console for the full traceback.
Promise.catch (async)
displayChart @ VM33:39
execCb @ require.js?v=d37b48b…cc60411154f593:1693
check @ require.js?v=d37b48b…5cc60411154f593:881
enable @ require.js?v=d37b48b…cc60411154f593:1173
init @ require.js?v=d37b48b…5cc60411154f593:786
(anonymous) @ require.js?v=d37b48b…cc60411154f593:1457
setTimeout (async)
req.nextTick @ require.js?v=d37b48b…cc60411154f593:1812
localRequire @ require.js?v=d37b48b…cc60411154f593:1446
requirejs @ require.js?v=d37b48b…cc60411154f593:1794
(anonymous) @ VM33:44
(anonymous) @ VM33:52
b @ jquery.min.js:2
Pe @ jquery.min.js:2
append @ jquery.min.js:2
OutputArea._safe_append @ outputarea.js:458
OutputArea.append_execute_result @ outputarea.js:497
OutputArea.append_output @ outputarea.js:325
OutputArea.handle_output @ outputarea.js:256
output @ codecell.js:399
Kernel._handle_output_message @ kernel.js:1199
i @ jquery.min.js:2
Kernel._handle_iopub_message @ kernel.js:1239
Kernel._finish_ws_message @ kernel.js:1018
(anonymous) @ kernel.js:1009
Promise.then (async)
Kernel._handle_ws_message @ kernel.js:1009
i @ jquery.min.js:2
If your notebook contains charts in cell outputs from previous Altair versions, then an old version of Vega-Lite might be loaded. Restarting Jupyter lab/notebook or switching browsers might not be enough to resolve this. Could you try in the following order:
- Clear all cell outputs in your notebook
- Restart Jupyter lab/notebook
- Clear your browser cache
- Run the notebook again
It's a new notebook with the above code as literally the only thing in it. I've tried on multiple computers and browsers.
Hmm, I am not sure what is going wrong here. You mentioned that another version of Altair worked; if you downgrade, or create a new env with a lower version, does it work again for you? Do other examples with disable_max_rows() work? Does it work if you use the vegafusion data transformer instead (from the same doc page)?
Hm well actually I thought it worked in a previous version, but to be honest I can't say 100% that I had tested it explicitly like this in a previous version. Is there another example using disable_max_rows() I should try?
You could try it with any of the examples in the gallery, using data with both fewer than and more than 5k rows, to understand exactly what is failing. The flights datasets exist in different versions with different numbers of rows (https://altair-viz.github.io/gallery/histogram_responsive.html), or you could just concatenate any of the datasets together to get over or under 5k rows.
OK maybe I'm completely miscategorizing this issue - maybe it's not a large dataset issue at all, but rather an issue with that sample code. Does the sample code I wrote originally above work for you guys? If I modify it to be something like this, it works fine, but just not in the original definition without encodings:
```python
import altair as alt
import pandas as pd
import numpy as np

x = np.arange(10000)
data = pd.DataFrame({'x': x, 'f(x)': np.sin(x / 5)})

alt.data_transformers.disable_max_rows()
alt.Chart(data).mark_line().encode(
    x='x',
    y='f(x)'
)
```
I'm happy it works for you now @bchastain !
The sample code works for me, both when disabling max rows and when using less data without max rows disabled. You should see a single point (or technically many on top of each other) like this: