Minor issues in milopy tools

Open emdann opened this issue 2 years ago • 6 comments

Noting down issues to fix and potential improvements to milopy, encountered while writing the tutorial (https://github.com/theislab/pertpy-tutorials/pull/4). The issues are entirely inherited from my code ofc :)

milo.add_covariate_to_nhoods_var

  • change name to add_covariate_to_nhoods_obs and update docs
  • Check the dtypes of added columns (e.g. a column can go from category to object, and then nhood_counts_by_cond complains)
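
To illustrate the dtype drift, here is a minimal sketch using plain pandas. The helper name `ensure_categorical` and the `nhood_obs` table are hypothetical stand-ins, not part of milopy or pertpy; concatenating categoricals with different categories is just one common way a column can lose its categorical dtype.

```python
import pandas as pd


def ensure_categorical(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Return a copy of `df` with the given columns cast to categorical dtype."""
    out = df.copy()
    for col in columns:
        # Only cast columns that are not categorical already.
        if not isinstance(out[col].dtype, pd.CategoricalDtype):
            out[col] = out[col].astype("category")
    return out


# Two categorical columns with different categories ...
a = pd.Series(["ctrl", "ctrl"], dtype="category")
b = pd.Series(["stim"], dtype="category")
# ... lose their categorical dtype when concatenated.
condition = pd.concat([a, b], ignore_index=True)

# Hypothetical stand-in for the neighbourhood annotation table.
nhood_obs = pd.DataFrame({"condition": condition})
fixed = ensure_categorical(nhood_obs, ["condition"])
```

A check like this could run right after the covariate columns are added, so that downstream plotting functions always see categoricals.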

pt.pl.milo.nhood_counts_by_cond

  • add an informative error for when the covariate is missing (directing users to add_covariate_to_nhoods_var)
  • convert strings to categoricals internally (or add informative errors for dtype object)
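
A rough sketch of what such an informative error could look like, using only the standard library. `require_covariate` is a hypothetical helper (not an existing milopy or pertpy function), and the wording of the message is only a suggestion.

```python
import difflib


def require_covariate(available, covariate, adder="add_covariate_to_nhoods_var"):
    """Raise an informative ValueError if `covariate` is not among `available` columns."""
    available = list(available)
    if covariate in available:
        return
    msg = (
        f"Covariate '{covariate}' not found in the neighbourhood annotations "
        f"(available columns: {available}). Add it first with `{adder}`."
    )
    # Suggest the closest existing column name, if any, to catch typos.
    close = difflib.get_close_matches(covariate, available, n=1)
    if close:
        msg += f" Did you mean '{close[0]}'?"
    raise ValueError(msg)


# Passes silently when the covariate is present.
require_covariate(["nhood_size", "condition"], "condition")
```

The plotting function could call this once at the top, before touching the data, so the user sees the fix instead of a bare KeyError.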

milo.da_nhoods

  • at the moment the use of subset_samples and subset_nhoods is buggy: in some edge cases it messes up the SpatialFDR calculation. While the bug is easy enough to catch with diagnostic plots (randomized SpatialFDR) and to fix (https://github.com/emdann/milopy/pull/30), I'm wondering whether these parameters should be dropped completely in the new implementation in pertpy. The use case of subset_samples is essentially covered by specifying contrasts (plus adding a new column in adata.obs if needed to specify groups). Using subset_nhoods is somewhat of a pathological hack (I don't think I've ever used it, and I can't even remember the original use case for adding this parameter). In practice, if one wants to test on a specific subset of cell phenotypes, the best solution is to subset the cells and restart from KNN graph building.
  • Add informative errors for missing columns in model matrix (i.e. when the reference level is specified in contrasts)
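
A minimal sketch of the model-matrix check, in pure Python. It assumes treatment coding (the first level is the reference and is absorbed into the intercept) and R-style column names such as `conditionSalmonella`; both are assumptions for illustration, as are the helper names `design_columns` and `check_contrast`.

```python
import re


def design_columns(covariate, levels):
    """Column names of a treatment-coded design for one categorical covariate.

    The first level is the reference, so it gets no column of its own.
    """
    return ["Intercept"] + [f"{covariate}{lvl}" for lvl in levels[1:]]


def check_contrast(contrast, covariate, levels):
    """Raise an informative ValueError if the contrast names a missing column."""
    columns = design_columns(covariate, levels)
    # Extract identifier-like terms, skipping operators and numeric weights.
    terms = re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", contrast)
    missing = [t for t in terms if t not in columns]
    if not missing:
        return columns
    msg = f"Contrast terms {missing} are not columns of the model matrix {columns}."
    if f"{covariate}{levels[0]}" in missing:
        msg += (
            f" '{levels[0]}' is the reference level of '{covariate}' and has no "
            "column of its own; drop it from the contrast or change the reference."
        )
    raise ValueError(msg)


# A contrast that only uses a non-reference level passes the check.
check_contrast("conditionSalmonella", "condition", ["Control", "Salmonella"])
```

With these assumptions, a contrast such as `conditionSalmonella-conditionControl` would be rejected with a message naming `Control` as the reference level, rather than failing later inside the model fit.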

Happy to open a PR to implement these changes.

emdann • Nov 15 '22 15:11

@emdann I'd be in favor of dropping the subset_samples and subset_nhoods parameters. Do you think that there's a reasonable place and a need somewhere to mention this though? Like maybe in the docs?

Zethson • Nov 15 '22 15:11

Hopefully the new tutorial covers the use of contrasts to specify comparisons. I could add an example on subsetting to a certain lineage or something.

emdann • Nov 15 '22 15:11

@emdann It would also be great to add plot_milo_diagnostics to milopy. I found this function pretty helpful, and it has been used several times in the best-practice notebook.

xinyuejohn • Jan 30 '23 14:01

@emdann do you think that you'd have time to tackle this soonish?

Zethson • Feb 01 '23 10:02

In principle I agree with adding the diagnostic plots to the package, although in my head that's more material for extended tutorials since it needs some guidance for interpretation. But up to you guys to decide what fits in the package and what doesn't!

Unfortunately I am going to be swamped with response-to-reviewer work for the next ~4 weeks.

emdann • Feb 01 '23 12:02

@emdann let's tackle this at the hackathon, OK?

Zethson • Nov 20 '23 13:11