scCODA
scCODA copied to clipboard
Bug in viz.boxplots function (and possibly others)
Hello,
I think I found a bug in the viz.boxplots
function, and also applied a solution that worked. Explaining it here in case other people ever ran into this problem.
I have a cell counts AnnData object where each row is a subject and each column are the cell counts for that subject.
For each subjects I have several categorical features such as Smoke
that can assume three values and Disease
that assumes two values.
I first plot
viz.boxplots(data_coda, feature_name = "Disease")
This works. Then, I plot
viz.boxplots(data_coda, feature_name = "Smoke")
which also works. However, if I plot the feature "Disease" again, it crashes with the following error:
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
Cell In[73], line 1
----> 1 viz.boxplots(data_coda,'GOLD')
File ~/miniforge3/envs/polvcopd4/lib/python3.9/site-packages/sccoda/util/data_visualization.py:319, in boxplots(data, feature_name, y_scale, plot_facets, add_dots, cell_types, args_boxplot, args_swarmplot, figsize, dpi, cmap, plot_legend, level_order)
315 args_swarmplot["hue_order"] = level_order
317 fig, ax = plt.subplots(figsize=figsize, dpi=dpi)
--> 319 sns.boxplot(x="Cell type", y=value_name, hue=feature_name, data=plot_df, fliersize=1,
320 palette=cmap, ax=ax, **args_boxplot)
322 if add_dots:
323 sns.swarmplot(
324 x="Cell type",
325 y=value_name,
(...)
331 **args_swarmplot
332 )
File ~/miniforge3/envs/polvcopd4/lib/python3.9/site-packages/seaborn/categorical.py:1634, in boxplot(data, x, y, hue, order, hue_order, orient, color, palette, saturation, fill, dodge, width, gap, whis, linecolor, linewidth, fliersize, hue_norm, native_scale, log_scale, formatter, legend, ax, **kwargs)
1627 color = _default_color(
1628 ax.fill_between, hue, color,
1629 {k: v for k, v in kwargs.items() if k in ["c", "color", "fc", "facecolor"]},
1630 saturation=saturation,
1631 )
1632 linecolor = p._complement_color(linecolor, color, p._hue_map)
-> 1634 p.plot_boxes(
1635 width=width,
1636 dodge=dodge,
1637 gap=gap,
1638 fill=fill,
1639 whis=whis,
1640 color=color,
1641 linecolor=linecolor,
1642 linewidth=linewidth,
1643 fliersize=fliersize,
1644 plot_kws=kwargs,
1645 )
1647 p._add_axis_labels(ax)
1648 p._adjust_cat_axis(ax, axis=p.orient)
File ~/miniforge3/envs/polvcopd4/lib/python3.9/site-packages/seaborn/categorical.py:745, in _CategoricalPlotter.plot_boxes(self, width, dodge, gap, fill, whis, color, linecolor, linewidth, fliersize, plot_kws)
742 ax.add_container(BoxPlotContainer(artists))
744 legend_artist = _get_patch_legend_artist(fill)
--> 745 self._configure_legend(ax, legend_artist, boxprops)
UnboundLocalError: local variable 'boxprops' referenced before assignment
Note that this behavior is not easy to reproduce since I think it depends on how many categories each feature has. Moreover, when plotting other features, I found that the legend contained the levels of the previous feature "Smoke". This means that some global variables are persisting between function calls and are leaking in subsequent calls. I think I reconstructed why this happens. In the data_visualization.py
script, the function signature of the boxplots
function is defined as
def boxplots(
data: AnnData,
feature_name: str,
y_scale: str = "relative",
plot_facets: bool = False,
add_dots: bool = False,
cell_types: Optional[list] = None,
args_boxplot: Optional[dict] = {},
args_swarmplot: Optional[dict] = {},
figsize: Optional[Tuple[int, int]] = None,
dpi: Optional[int] = 100,
cmap: Optional[str] = "Blues",
plot_legend: Optional[bool] = True,
level_order: List[str] = None
) -> Optional[Tuple[plt.Subplot, sns.axisgrid.FacetGrid]]:
This function signature defines the default arguments of args_boxplot
and args_swamplot
as empty dicts ({}
). Using mutable default values is generally not recommended in python because it causes unexpected behaviors between function calls (see here).
To avoid this behavior, I just replaced the default values with None
and filled them within the function:
def boxplots(
data: AnnData,
feature_name: str,
y_scale: str = "relative",
plot_facets: bool = False,
add_dots: bool = False,
cell_types: Optional[list] = None,
args_boxplot: Optional[dict] = None,
args_swarmplot: Optional[dict] = None,
figsize: Optional[Tuple[int, int]] = None,
dpi: Optional[int] = 100,
cmap: Optional[str] = "Blues",
plot_legend: Optional[bool] = True,
level_order: List[str] = None
) -> Optional[Tuple[plt.Subplot, sns.axisgrid.FacetGrid]]:
and inside the function we add
if args_swarmplot is None:
args_swarmplot = {}
if args_boxplot is None:
args_boxplot = {}
Not sure if other functions use mutable default values. Thought it could be useful if other people ever ran into this problem.
Thanks for spotting this! I'll keep this fix in mind if we ever release a new version of this package. The actively maintained implementation of scCODA in the pertpy package already contains a very similar fix.
I didn't know about the pertpy package, I'll check it out, thanks for letting me know. Not sure if I should close the issue or leave it open?
I'll leave it open for now, so that I won't forget about this fix. Thanks again!