pertpy
Minor issues in milopy tools
Noting down issues to fix and potential improvements to milopy that I encountered while writing the tutorial (https://github.com/theislab/pertpy-tutorials/pull/4). The issues are entirely inherited from my code ofc :)
milo.add_covariate_to_nhoods_var
- change name to add_covariate_to_nhoods_obs and update docs
- Check types of added columns (e.g. can go from category to object and then nhood_counts_by_cond complains)
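The dtype drift described in the second point could be guarded against with a small check after the covariate columns are copied. A minimal sketch, assuming the covariates sit in a plain pandas DataFrame; `restore_categoricals` and the toy table are hypothetical and not part of milopy or pertpy:

```python
import pandas as pd


def restore_categoricals(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with object/string columns cast to category dtype."""
    out = df.copy()
    for col in out.columns:
        dtype = out[col].dtype
        if isinstance(dtype, pd.CategoricalDtype):
            continue  # already categorical, nothing to do
        if pd.api.types.is_object_dtype(dtype) or isinstance(dtype, pd.StringDtype):
            out[col] = out[col].astype("category")
    return out


# Toy stand-in for a sample-level covariate table (hypothetical data).
# "condition" is deliberately stored as object to mimic the lost category dtype.
covariates = pd.DataFrame(
    {
        "condition": pd.Series(["ctrl", "stim", "ctrl"], dtype=object),
        "n_cells": [120, 95, 143],
    }
)
fixed = restore_categoricals(covariates)
```

Numeric columns are left untouched, so only the columns that would trip up the plotting function are converted.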
pt.pl.milo.nhood_counts_by_cond
- add informative error for when covariate is missing (directing to use add_covariate_to_nhoods_var)
- convert strings to categoricals internally (or add informative errors for dtype object)
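The two checks listed above might look roughly like this. It is only a sketch: `check_covariate` is a hypothetical helper, and the sample-level table is assumed to be a pandas DataFrame rather than the actual object used inside the plotting function:

```python
import pandas as pd


def check_covariate(sample_table: pd.DataFrame, covariate: str) -> None:
    """Raise a descriptive error if `covariate` is missing or has object dtype."""
    if covariate not in sample_table.columns:
        raise KeyError(
            f"Covariate '{covariate}' not found among sample-level columns "
            f"{list(sample_table.columns)}. "
            "Add it first with add_covariate_to_nhoods_var."
        )
    if pd.api.types.is_object_dtype(sample_table[covariate].dtype):
        raise TypeError(
            f"Covariate '{covariate}' has dtype object; "
            "convert it to a categorical before plotting."
        )


# Toy sample-level table (hypothetical data)
sample_table = pd.DataFrame(
    {"condition": pd.Categorical(["ctrl", "stim", "ctrl"])}
)

check_covariate(sample_table, "condition")  # passes silently

try:
    check_covariate(sample_table, "batch")  # column was never added
except KeyError as err:
    missing_message = str(err)
```

Raising early with the name of the function to call keeps the user from hitting an opaque pandas error further down.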
milo.da_nhoods:
- at the moment the use of subset_samples and subset_nhoods is buggy: in some edge cases this messes up the SpatialFDR calculation. While the bug is easy enough to catch with diagnostic plots (randomized SpatialFDR) and to fix (https://github.com/emdann/milopy/pull/30), I'm wondering whether these parameters should be dropped completely in the new implementation in pertpy. The use case of subset_samples is essentially covered by specifying contrasts (+ adding a new column in adata.obs if needed to specify groups). Using subset_nhoods is somewhat of a pathological hack (I don't think I've ever used it and I can't even remember the original use case for adding this parameter). In practice, if one wants to test on a specific subset of cell phenotypes, the best solution is to subset cells and restart from KNN graph building.
- Add informative errors for missing columns in model matrix (i.e. when the reference level is specified in contrasts)
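For the model-matrix point, one possible shape for the check is sketched below. It assumes the design matrix has an intercept and that its columns are named as covariate name plus level (e.g. conditionSalmonella), so the reference level has no column of its own. `check_contrast_terms`, the simple term parsing, and the example level names are all hypothetical:

```python
import re


def check_contrast_terms(contrast: str, design_columns: list[str]) -> None:
    """Raise a descriptive error if a contrast names a column absent from the design."""
    # Naive parsing (assumption): terms are identifiers joined by +, -, * or spaces
    terms = re.findall(r"[A-Za-z_.][\w.]*", contrast)
    missing = [term for term in terms if term not in design_columns]
    if missing:
        raise ValueError(
            f"Contrast terms {missing} are not columns of the model matrix "
            f"{design_columns}. A reference level is absorbed into the intercept "
            "and cannot be named in a contrast."
        )


# Toy design with 'Control' as the reference level (hypothetical names)
design_columns = ["(Intercept)", "conditionSalmonella"]

check_contrast_terms("conditionSalmonella", design_columns)  # passes silently

try:
    check_contrast_terms("conditionSalmonella-conditionControl", design_columns)
except ValueError as err:
    contrast_message = str(err)
```

The error names both the offending term and the available columns, which should make it obvious when the reference level was used by mistake.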
Happy to open a PR to implement these changes.
@emdann I'd be in favor of dropping the subset_samples and subset_nhoods parameters. Do you think there's a need, and a reasonable place, to mention this somewhere though? Maybe in the docs?
Hopefully the new tutorial covers the use of contrasts to specify comparisons. I could add an example on subsetting to a certain lineage or something.
@emdann It would also be great to add plot_milo_diagnostics to milopy. I found this function pretty helpful, and it has been used several times in the best-practice notebook.
@emdann do you think that you'd have time to tackle this soonish?
In principle I agree with adding the diagnostic plots to the package, although in my head that's more material for extended tutorials since it needs some guidance for interpretation. But up to you guys to decide what fits in the package and what doesn't!
Unfortunately I am going to be swamped with response-to-reviewer work for the next ~4 weeks.
@emdann let's tackle this at the hackathon, OK?