seaborn `FacetGrid.map_dataframe` passes disallowed keyword arguments to `pointplot`

My code that used to work perfectly fine with 0.11.0 breaks with the new 0.12.0 release. The code creates a FacetGrid and then applies pointplot() to each of the cells as follows:

...
g = sns.FacetGrid(df_fs, row="metric", col="learner_name",
                  hue="variable", height=2.5, aspect=1,
                  margin_titles=True, despine=True, sharex=False,
                  sharey=False, legend_out=False, palette="Set1")
g = g.map_dataframe(sns.pointplot, "training_set_size", "value",
                    scale=.5, ci=None)
...

The relevant traceback is as follows:

Traceback (most recent call last):
 ...
  File "/builds/EducationalTestingService/skll/skll/experiments/output.py", line 127, in generate_learning_curve_plots
    g = g.map_dataframe(sns.pointplot, "training_set_size", "value",
  File "/root/sklldev/lib/python3.8/site-packages/seaborn/axisgrid.py", line 819, in map_dataframe
    self._facet_plot(func, ax, args, kwargs)
  File "/root/sklldev/lib/python3.8/site-packages/seaborn/axisgrid.py", line 848, in _facet_plot
    func(*plot_args, **plot_kwargs)
TypeError: pointplot() got an unexpected keyword argument 'label'

Looking at the code for FacetGrid.map_dataframe, it does indeed create a label keyword argument which, I guess, causes the failure when pointplot() is called. From reading the release notes, it looks like this is because of this item:

Removed the (previously-unused) option to pass additional keyword arguments to pointplot()

That keyword argument is created whenever hue is specified, and I am not sure how to get around that, since I have multiple variables that I want to represent with different colors. If there's another way to achieve this, I'd really appreciate any guidance.
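To illustrate the mechanism without needing seaborn installed, here is a minimal sketch. `strict_plot` is a hypothetical stand-in for a plotting function with a closed signature, and `drop_label` is a hypothetical helper, not a seaborn API. The injected `color` keyword is my understanding of what the grid adds alongside `label`; treat it as an assumption.

```python
import functools


def strict_plot(x, y, color=None):
    """Stand-in for a plotting function that accepts no extra keywords."""
    return {"x": x, "y": y, "color": color}


def drop_label(func):
    """Return a wrapper that discards a `label` keyword before calling func."""
    @functools.wraps(func)
    def wrapper(*args, label=None, **kwargs):
        return func(*args, **kwargs)
    return wrapper


# Mimic what the grid does for each hue level: inject extra keywords.
injected = {"color": "C0", "label": "train_score_mean"}

try:
    strict_plot("training_set_size", "value", **injected)
    error_message = None
except TypeError as exc:
    # Same kind of failure as in the traceback above.
    error_message = str(exc)

# The wrapped function swallows `label` and forwards everything else.
result = drop_label(strict_plot)("training_set_size", "value", **injected)
```

Wrapping `sns.pointplot` the same way might sidestep the error, but I have not verified that against 0.12.0, and the discarded labels would not be available to a legend.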

Sep 07 '22 16:09 desilinguist

I tried modifying the code to move the hue variable from the FacetGrid() instantiation call to the map_dataframe() call instead:

...
g = sns.FacetGrid(df_fs, row="metric", col="learner_name", height=2.5,
                  aspect=1, margin_titles=True, despine=True, sharex=False,
                  sharey=False, legend_out=False, palette="Set1")
g = g.map_dataframe(sns.pointplot, x="training_set_size", y="value",
                    hue="variable", scale=.5, errorbar=None)
...

While this code runs without errors, it does not produce the correct plot. Here's the plot with the above code with v0.12.0:

For comparison, here's the plot as produced by the original code with v0.11.2:

Sep 07 '22 18:09 desilinguist

I figured out how to make this work by following these recommendations:

- Hue levels and keywords should be handled by the plotting function, not by FacetGrid.
- The variable in the data frame that maps to hue levels needs to be categorical.
- It is now recommended to explicitly assign palette colors to hue levels.

Here's the new code:

...
df_fs["variable"] = df_fs["variable"].astype("category")
g = sns.FacetGrid(df_fs, row="metric", col="learner_name",
                  height=2.5, aspect=1, margin_titles=True,
                  despine=True, sharex=False,
                  sharey=False, legend_out=False)
g = g.map_dataframe(sns.pointplot, x="training_set_size",
                    y="value", hue="variable", scale=.5,
                    errorbar=None,
                    palette={"train_score_mean": train_color,
                             "test_score_mean": test_color})
...

This code now produces the same (correct) plot as with v0.11.2.
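For the explicit palette, one way to avoid a silent mismatch between hue levels and colors is to build the dict from the levels themselves. This is only a sketch: `explicit_palette` is a hypothetical helper, and the hex colors are placeholders, since `train_color` and `test_color` are defined elsewhere in my code.

```python
def explicit_palette(levels, colors):
    """Map each hue level to one color, failing loudly on a length mismatch."""
    levels = list(levels)
    colors = list(colors)
    if len(levels) != len(colors):
        raise ValueError(f"{len(levels)} hue levels but {len(colors)} colors")
    return dict(zip(levels, colors))


# Placeholder colors, standing in for train_color and test_color.
palette = explicit_palette(
    ["train_score_mean", "test_score_mean"],
    ["#1f77b4", "#d62728"],
)
```

The resulting dict can be passed as `palette=` in the `map_dataframe` call in place of the literal dict.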

Sep 07 '22 20:09 desilinguist

We should unbreak this, even if it's discouraged usage.

Glad you were able to work out the right thing to do here, but I am a little curious why you didn't opt for catplot, which would do all this complicated bookkeeping for you.

Sep 07 '22 22:09 mwaskom

Indeed, catplot would have been simpler and I did try it but the marker size seemed to be much larger and less to my liking than if I used FacetGrid.

Sep 08 '22 01:09 desilinguist

Do you have an example? Catplot should basically just be generating the code in your third post.

Sep 08 '22 02:09 mwaskom

Sure! Attached are two plots that are saved in 300 DPI using plt.savefig(). The first was generated using FacetGrid + pointplot and the second was generated using catplot. I am doing a bunch of matplotlib-level processing to add the plot titles and legend manually but that part is identical between the two scenarios.

FacetGrid + pointplot

catplot

Sep 08 '22 13:09 desilinguist

Thanks but I’d need to see the actual code to make sense of the example.

Sep 08 '22 14:09 mwaskom

Ah, sorry. Please take a look at the generate_learning_curve_plots() function here.

Sep 08 '22 14:09 desilinguist

That link 404s (is it a private repo?)

I can't reproduce whatever you might be seeing with a simple example though:

(The tips example dataframe loads with categorical dtypes, so it simplifies the bookkeeping when using FacetGrid.)

Sep 08 '22 22:09 mwaskom

Apologies, that branch was probably merged by the time you got to it. It's now in the main branch.

Sep 08 '22 22:09 desilinguist

I don't see any use of catplot on that page?

Sep 08 '22 23:09 mwaskom

Yeah, as I mentioned, I didn't use catplot in production because of the marker size.

Here's a gist that shows how I combined the FacetGrid and pointplot calls together.

However, I am extremely embarrassed to say that it now works fine 😬! Looking back, that is probably because I forgot to include the scale=0.5 keyword in the catplot call when I originally ran the test.

Apologies for wasting your time on this secondary issue.
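For anyone landing here later, this is roughly what the catplot version looks like. It is a sketch assuming seaborn 0.12, where extra keywords such as `scale` are forwarded to pointplot; later releases may treat `scale` differently. `CATPLOT_KWARGS` and `learning_curve_catplot` are names made up for this example, and `palette` is whatever dict maps the hue levels to colors.

```python
# Keyword arguments mirroring the FacetGrid + pointplot call in this thread.
CATPLOT_KWARGS = dict(
    kind="point",
    x="training_set_size", y="value", hue="variable",
    row="metric", col="learner_name",
    height=2.5, aspect=1, margin_titles=True,
    sharex=False, sharey=False, legend_out=False,
    scale=0.5,       # forwarded to pointplot; the keyword I originally forgot
    errorbar=None,   # 0.12 replacement for ci=None
)


def learning_curve_catplot(df, palette):
    """Draw the faceted point plot with catplot instead of FacetGrid.

    Assumes `df` has the columns named in CATPLOT_KWARGS and that the hue
    column is categorical.
    """
    import seaborn as sns  # imported lazily so this module loads without it

    return sns.catplot(data=df, palette=palette, **CATPLOT_KWARGS)
```

Keeping the keywords in one dict makes it easier to compare the two approaches and harder to drop `scale` by accident.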

Sep 08 '22 23:09 desilinguist
