`regress_out` does not capture PerfectSeparation with statsmodels >= 0.14
Please make sure these conditions are met
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of scanpy.
- [X] (optional) I have confirmed this bug exists on the main branch of scanpy.
What happened?
Hi,
since statsmodels 0.14, perfect separation no longer raises an error but emits a warning instead (see function doc here). Because scanpy currently only catches the now-outdated error (instead of catching the warning), users may see many `PerfectSeparationWarning` messages from `regress_out` whenever perfect separation is detected (see usage in scanpy here). This change seems to follow on the heels of this issue in statsmodels. I propose catching the warning in the same way the error was being caught.
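A minimal sketch of the idea, not scanpy's actual code: inside a `warnings.catch_warnings()` block the warning can be escalated to an exception and then handled like the old error. The helper `fit_or_none` and the stand-in `fake_fit` are hypothetical names used only for illustration, and the fallback class is an assumption so the snippet also runs where statsmodels >= 0.14 is not installed.

```python
import warnings

try:
    # Available in statsmodels >= 0.14
    from statsmodels.tools.sm_exceptions import PerfectSeparationWarning
except ImportError:
    # Stand-in so the sketch runs without statsmodels >= 0.14 (assumption)
    class PerfectSeparationWarning(Warning):
        pass


def fit_or_none(fit):
    """Call fit() and return its result, or None if perfect separation is flagged.

    Hypothetical helper: `fit` is any zero-argument callable, e.g. a
    lambda wrapping a GLM fit.
    """
    with warnings.catch_warnings():
        # Turn only this warning category into an exception, only in this block
        warnings.simplefilter("error", PerfectSeparationWarning)
        try:
            return fit()
        except PerfectSeparationWarning:
            return None


def fake_fit():
    # Stand-in for a model fit that hits perfect separation (hypothetical)
    warnings.warn(
        "Perfect separation or prediction detected", PerfectSeparationWarning
    )
    return "fitted"


separated = fit_or_none(fake_fit)
regular = fit_or_none(lambda: "fitted")
```

Scoping the filter with `catch_warnings()` keeps the user's global warning configuration untouched. To support older statsmodels versions as well, the `except` clause could presumably also list `PerfectSeparationError`.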
Cheers,
Jesko
Minimal code sample
import anndata as ad
import scanpy as sc
import numpy as np
import pandas as pd
adata = ad.AnnData(np.array([[0,0,1,1]]).T, obs=pd.DataFrame({"a":[0,0,1,1]}))
sc.pp.regress_out(adata, "a")
Error output
.../statsmodels/genmod/generalized_linear_model.py:1257: PerfectSeparationWarning: Perfect separation or prediction detected, parameter may not be identified
warnings.warn(msg, category=PerfectSeparationWarning)
Versions
anndata 0.10.4
scanpy 1.9.6
statsmodels 0.14.0