scatter_matrix formatting considered harmful
ALL software version info
Python : 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ]
Operating system : macOS-14.0-arm64-arm-64bit
Panel comms : default
holoviews : 1.18.3
bokeh : 3.3.4
colorcet : 3.0.1
dask : 2023.6.0
datashader : 0.16.0
geoviews : 1.11.0
hvplot : 0.9.2
IPython : 8.15.0
jupyterlab : 4.0.11
matplotlib : 3.8.0
notebook : 7.0.6
numba : 0.58.0
numpy : 1.24.3
pandas : 2.1.1
panel : 1.3.1
param : 2.0.1
pillow : 10.0.1
pyarrow : 11.0.0
pyviz_comms : 2.3.0
scipy : 1.11.3
spatialpandas : 0.4.9
xarray : 2023.6.0
Description of expected behavior and the observed behavior
I'd expect scatter_matrix to be formatted reasonably: subplots all lined up, axis labels readable, text not overlapping, and a single Bokeh toolbar for the entire figure. That's not what's happening:
import pandas as pd
import hvplot.pandas
from hvplot import scatter_matrix
url = 'https://raw.githubusercontent.com/shoukewei/data/main/data-pydm/gdp_top_six_economies.csv'
df = pd.read_csv(url)
scatter_matrix(df, alpha=0.5, width=600, height=600, xrotation=0)
scatter_matrix(df, alpha=0.5, width=600, height=600, xrotation=90)
Just adding that the spacing issue seems unrelated to the toolbar:
scatter_matrix(df, alpha=0.5, width=600, height=600, xrotation=0).opts(toolbar='above')
> Just adding that the spacing issue seems unrelated to the toolbar:
Yes, I think the spacing issues have been there for some time, while the toolbar issue is relatively recent, but I haven't tried to do a git bisect to pin that down.
Quick feedback:
- Toolbar issue: possibly introduced in https://github.com/holoviz/holoviews/pull/5873
- Subplots not aligned: happened sometime around the Bokeh 3 transition (not sure it's a Bokeh 3 problem either)
https://holoviews.org/reference/containers/bokeh/GridSpace.html
There may be no hvPlot issue at all.
Thanks. I've opened https://github.com/holoviz/holoviews/issues/6126 for the toolbar issue, and @mattpap is looking at it from the Bokeh side.
Bad plot alignment is caused by fixed frame sizing (Plot.frame_{width,height,align}), which works reliably only for single plots and doesn't work well in all other cases (see e.g. issue https://github.com/bokeh/bokeh/issues/13225). I suppose it's time to implement this properly.
From the initial list of issues:
- [ ] subplots all lined up: Mateusz indicated this is a Bokeh issue
- [x] a single Bokeh toolbar for the entire figure: HoloViews issue fixed in https://github.com/holoviz/holoviews/pull/6127
- [ ] axis labels readable, text not overlapping
That leaves us with the third item: axis labels readable, text not overlapping. The default Bokeh formatter is the BasicTickFormatter:
Compare that to the default in Plotly Express:
We can get similar behavior by defining a NumeralTickFormatter:
However, it also has its limits:
Certainly, we could better document xformatter/yformatter. But should we also consider defaulting to a more user-friendly formatter?
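For reference, here is a minimal sketch of the kind of formatter discussed above, applied to a plain Bokeh figure rather than to scatter_matrix itself. The data values and the `"0.0a"` format string are illustrative assumptions, not what was used for the screenshots in this thread. `"0.0a"` asks for one decimal place and an abbreviated magnitude suffix.

```python
from bokeh.models import NumeralTickFormatter
from bokeh.plotting import figure

# Illustrative values in the trillions (assumed, roughly GDP-sized numbers).
x = [1.0e12, 5.0e12, 1.5e13, 2.5e13]
y = [2.0e12, 4.0e12, 1.2e13, 2.0e13]

p = figure(width=300, height=300)
p.scatter(x, y, alpha=0.5)

# Replace the default BasicTickFormatter on both axes with an
# abbreviating formatter; the format string is an assumption.
formatter = NumeralTickFormatter(format="0.0a")
p.xaxis.formatter = formatter
p.yaxis.formatter = formatter
```

In hvPlot, a formatter like this would presumably be passed through xformatter/yformatter, as mentioned above. Whether that reaches every subplot of scatter_matrix has not been checked here.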
Defaulting to a more usable formatter sounds like a great idea. @mattpap, any idea why the tick formatter didn't decide to drop the intermediate tick marks? Here I'd be hoping to get one label on the left of the x axis, and one on the right:
> any idea why the tick formatter didn't decide to drop the intermediate tick marks?
This is handled by setting Axis.major_label_policy = NoOverlap(). When this was implemented, the default (AllLabels) was kept for backwards compatibility. Tickers and tick formatters have no access to the screen space, so they can't make any adjustments based on the positioning of labels.
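A minimal sketch of that setting on a plain Bokeh figure, assuming nothing about how hvPlot or HoloViews would expose it. The data and figure size are made up for illustration.

```python
from bokeh.models import NoOverlap
from bokeh.plotting import figure

# A deliberately narrow figure with large values, so that long tick
# labels are likely to collide (illustrative data only).
p = figure(width=200, height=200)
p.scatter([1.0e12, 2.5e13], [2.0e12, 2.0e13])

# Switch both axes from the default AllLabels policy to NoOverlap,
# which lets the axis skip major labels that would overlap.
p.xaxis.major_label_policy = NoOverlap()
p.yaxis.major_label_policy = NoOverlap()
```

The policy lives on the axis rather than on the ticker or formatter, which matches the explanation above: only the axis knows the screen positions of its labels.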
Thanks! Ok, @maxime, can you try out the NoOverlap option with NumeralTickFormatter? For hvPlot I strongly favor improving the user experience over preserving previous defaults.