MUSIC-style topic modeling

Open JohnGoertz opened this issue 3 months ago • 2 comments

Description of feature

Hello! This is a really helpful package, thank you for putting it together! I've been working on porting the MUSIC (Model-based Understanding of SIngle-cell CRISPR screening) R package to Python, using scvi for imputation and scikit-learn for topic modeling. It seems like a natural fit for pertpy. Would you be open to a pull request? Is there anything I should keep in mind for the implementation beyond what's in the scanpy and pertpy contribution guides?

JohnGoertz, Sep 30 '25 15:09

Dear @JohnGoertz ,

thank you for reaching out! This is great. I am interested in having this implementation in pertpy. A few things to consider beyond the contributing guide:

  1. The pipeline has some preprocessing steps and some more tool-like functionality. Analogous to, say, Mixscape, I suggest that we make it a tool and put all functionality into a pt.tl.Music object.
  2. Just recently, I made scvi-tools an optional dependency. Ideally we can keep it that way. Depending on how we implement it, we should be able to import scvi lazily, so that users only need to install it if they want to use Music.
  3. We should have a tutorial in https://github.com/scverse/pertpy-tutorials, which we'll then also render.
  4. I would be very happy if we could have a quick comparison between the R and Python implementations in https://github.com/theislab/pertpy-reproducibility/tree/main/benchmark. The results don't have to match perfectly but should be very close.
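
For the optional-dependency point, a minimal sketch of a lazy import could look like the following. The helper name `lazy_import` and the wording of the error message are hypothetical and not taken from pertpy; the only assumption about scvi is that the package installs from PyPI as scvi-tools and imports as `scvi`.

```python
import importlib
from types import ModuleType


def lazy_import(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency on first use (hypothetical helper).

    Raises an ImportError with an install hint if the module is missing.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install {pip_name}"
        ) from err
```

Inside the tool, the call would then be something like `scvi = lazy_import("scvi", "scvi-tools")`, placed in the code path that actually needs it, so that importing pertpy itself never requires scvi.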

I am very happy to help guide your implementation to ensure that it fits our design and is ideally a smooth process for you. Please feel free to ping me whenever you have questions or would like to have intermediate feedback.

Thanks!

Zethson, Sep 30 '25 16:09

Great, makes sense! I'll get to work on that, using the Mixscape tool as a model. SCVI can definitely be lazily loaded; it can even be extra-lazy and imported only within the imputation function. Imputation isn't required for topic modeling, but it helps.
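
As a rough sketch of the "extra-lazy" idea, the scvi import can live inside the imputation method itself. The class name `Music` follows the suggestion earlier in the thread; the method names `impute` and `fit_topics`, their signatures, and the placeholder bodies are assumptions made for illustration, not the actual implementation.

```python
class Music:
    """Hypothetical skeleton; method names and bodies are placeholders."""

    def impute(self, adata):
        # scvi is imported only here, so constructing Music or calling
        # other methods never requires scvi-tools to be installed.
        try:
            import scvi  # noqa: F401
        except ImportError as err:
            raise ImportError(
                "Music.impute requires scvi-tools: pip install scvi-tools"
            ) from err
        raise NotImplementedError("imputation logic would go here")

    def fit_topics(self, adata, n_topics: int = 10):
        # No scvi import is needed in this code path.
        raise NotImplementedError("topic modeling logic would go here")
```

With this layout, a user without scvi-tools only hits the install hint when calling `impute`, not when importing or instantiating the tool.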

JohnGoertz, Oct 03 '25 18:10