seaborn
`FacetGrid.map_dataframe` passes disallowed keyword arguments to `pointplot`
My code, which used to work perfectly fine with 0.11.0, breaks with the new 0.12.0 release. The code creates a FacetGrid and then applies pointplot() to each of the cells as follows:
...
g = sns.FacetGrid(df_fs, row="metric", col="learner_name",
                  hue="variable", height=2.5, aspect=1,
                  margin_titles=True, despine=True, sharex=False,
                  sharey=False, legend_out=False, palette="Set1")
g = g.map_dataframe(sns.pointplot, "training_set_size", "value",
                    scale=.5, ci=None)
...
The relevant traceback is as follows:
Traceback (most recent call last):
...
File "/builds/EducationalTestingService/skll/skll/experiments/output.py", line 127, in generate_learning_curve_plots
g = g.map_dataframe(sns.pointplot, "training_set_size", "value",
File "/root/sklldev/lib/python3.8/site-packages/seaborn/axisgrid.py", line 819, in map_dataframe
self._facet_plot(func, ax, args, kwargs)
File "/root/sklldev/lib/python3.8/site-packages/seaborn/axisgrid.py", line 848, in _facet_plot
func(*plot_args, **plot_kwargs)
TypeError: pointplot() got an unexpected keyword argument 'label'
Looking at the code for FacetGrid.map_dataframe, it does indeed create a label keyword argument which, I guess, causes the failure when pointplot() is called. From reading the release notes, it looks like this is because of this item:
Removed the (previously-unused) option to pass additional keyword arguments to pointplot()
That keyword argument is created if hue is specified, which I am not sure how to get around since I have multiple variables that I want to represent with different colors. If there's another way to achieve this, I'd really appreciate any guidance.
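The failure mode described above can be reproduced without seaborn at all. The following is a minimal sketch, not seaborn's actual implementation: `facet_plot`, `strict_plot`, `lenient_plot`, and `try_plot` are hypothetical names chosen for illustration. It assumes only what the traceback shows, namely that the grid adds a `label` keyword per hue level before calling the plotting function.

```python
def facet_plot(func, data, hue=None, **kwargs):
    """Hypothetical stand-in for a grid that injects `label` when hue is set."""
    if hue is None:
        func(data=data, **kwargs)
        return
    # One call per hue level, each with an extra `label` keyword added.
    for level in sorted({row[hue] for row in data}):
        subset = [row for row in data if row[hue] == level]
        func(data=subset, label=level, **kwargs)


def strict_plot(*, data, x, y):
    """Accepts only named parameters, like a function without **kwargs."""
    return len(data)


def lenient_plot(*, data, x, y, **kwargs):
    """Tolerates extra keywords such as `label`."""
    return len(data)


def try_plot(func):
    """Return the TypeError message raised by the call, or None on success."""
    data = [
        {"variable": "train_score_mean", "training_set_size": 100, "value": 0.9},
        {"variable": "test_score_mean", "training_set_size": 100, "value": 0.7},
    ]
    try:
        facet_plot(func, data, hue="variable", x="training_set_size", y="value")
    except TypeError as exc:
        return str(exc)
    return None


error_message = try_plot(strict_plot)
print(error_message)           # message names the unexpected 'label' keyword
print(try_plot(lenient_plot))  # None: the extra keyword is absorbed
```

The sketch suggests why removing the catch-all keyword arguments from a plotting function is enough to break a caller that silently relied on them.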
I tried modifying the code to move the hue variable from the FacetGrid() instantiation call to the map_dataframe() call instead:
...
g = sns.FacetGrid(df_fs, row="metric", col="learner_name", height=2.5,
                  aspect=1, margin_titles=True, despine=True, sharex=False,
                  sharey=False, legend_out=False, palette="Set1")
g = g.map_dataframe(sns.pointplot, x="training_set_size", y="value",
                    hue="variable", scale=.5, errorbar=None)
...
While this code runs without errors, it does not produce the correct plot. Here's the plot with the above code with v0.12.0:
For comparison, here's the plot as produced by the original code with v0.11.2:
I figured out how to make this work by following the recommendations that:
- hue levels and keywords should be handled by the plotting function and not FacetGrid.
- we need to make sure that the variable in the data frame that maps to hue levels is categorical.
- it is now recommended to explicitly assign palette colors to hue levels.
Here's the new code:
...
df_fs["variable"] = df_fs["variable"].astype("category")
g = sns.FacetGrid(df_fs, row="metric", col="learner_name",
                  height=2.5, aspect=1, margin_titles=True,
                  despine=True, sharex=False,
                  sharey=False, legend_out=False)
g = g.map_dataframe(sns.pointplot, x="training_set_size",
                    y="value", hue="variable", scale=.5,
                    errorbar=None,
                    palette={"train_score_mean": train_color,
                             "test_score_mean": test_color})
...
This code now produces the same (correct) plot as with v0.11.2.
We should unbreak this, even if it's discouraged usage.
Glad you were able to work out the right thing to do here, but I am a little curious why you didn't opt for catplot, which would do all this complicated bookkeeping for you.
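For readers following along, here is a rough sketch of what the catplot route might look like. It is not the code from the thread: the data frame is synthetic, the column names simply mirror those in the snippets above, the two colors are arbitrary placeholders, and it assumes seaborn 0.12 or later (for `errorbar`). The `scale=.5` argument used in the thread is left out here to keep the sketch independent of how a given release handles it.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Synthetic long-form data standing in for the learning-curve frame.
rows = []
for metric in ["accuracy", "f1"]:
    for learner in ["LogisticRegression", "SVC"]:
        for size in [100, 200, 400]:
            rows.append((metric, learner, size, "train_score_mean", 0.90))
            rows.append((metric, learner, size, "test_score_mean", 0.60 + size / 2000))

df = pd.DataFrame(
    rows,
    columns=["metric", "learner_name", "training_set_size", "variable", "value"],
)

# catplot builds the FacetGrid and handles hue levels and colors itself.
g = sns.catplot(
    data=df,
    x="training_set_size",
    y="value",
    hue="variable",
    row="metric",
    col="learner_name",
    kind="point",
    errorbar=None,
    # Placeholder colors, assigned explicitly per hue level.
    palette={"train_score_mean": "#1f77b4", "test_score_mean": "#ff7f0e"},
    height=2.5,
    aspect=1,
    margin_titles=True,
    legend_out=False,
)
```

The returned object is a FacetGrid, so any matplotlib-level post-processing of titles and legends should carry over from the FacetGrid-based version.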
Indeed, catplot would have been simpler and I did try it, but the marker size seemed to be much larger and less to my liking than if I used FacetGrid.
Do you have an example? Catplot should basically just be generating the code in your third post.
Sure! Attached are two plots that are saved at 300 DPI using plt.savefig(). The first was generated using FacetGrid + pointplot and the second was generated using catplot. I am doing a bunch of matplotlib-level processing to add the plot titles and legend manually, but that part is identical between the two scenarios.
Thanks but I’d need to see the actual code to make sense of the example.
Ah, sorry. Please take a look at the generate_learning_curve_plots() function here.
That link 404s (is it a private repo?)
I can't reproduce whatever you might be seeing with a simple example though:
(The tips example dataframe loads with categorical dtypes, so it simplifies the bookkeeping when using FacetGrid.)
Apologies, that branch was probably merged by the time you got to it. It's now in the main branch.
I don't see any use of catplot on that page?
Yeah, as I mentioned, I didn't use catplot in production because of the marker size.
Here's a gist that shows how I combined the FacetGrid and pointplot calls together.
However, I am extremely embarrassed to say that it now works fine 😬! Looking back on it, that is probably because I forgot to include the scale=0.5 keyword in the catplot call when I originally ran the test.
Apologies for wasting your time on this secondary issue.