pertpy Solve TODOs and verify DIALOGUE 1

Description of changes

I have benchmarked the initial part of DIALOGUE (DIALOGUE1). I changed the _pseudobulk_feature_space function so that the user can choose the aggregation method (median or mean) and the output now has samples as rows, matching the R implementation. I also modified the _scale_data function to center, scale, and cap extreme values (with a cap of 0.01) in a way that mirrors the R functions center.matrix and cap.mat. In addition, I updated the _load function to optionally restrict the data to common samples across cell types. The output of _load is now a dataframe that is converted back to a numpy array before further processing.
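As a rough illustration of the two steps described above, here is a minimal sketch, not the actual pertpy code. The names pseudobulk and scale_and_cap are hypothetical stand-ins for _pseudobulk_feature_space and _scale_data, whose real signatures are not shown in this PR. It also assumes that the cap of 0.01 is a quantile fraction (values are clipped to the 1st and 99th percentile of each column) and that scaling uses the sample standard deviation, as R's sd does.

```python
import numpy as np
import pandas as pd


def pseudobulk(features: pd.DataFrame, sample_ids, agg_func: str = "median") -> pd.DataFrame:
    """Aggregate per-cell features into one row per sample (hypothetical helper)."""
    if agg_func not in ("mean", "median"):
        raise ValueError("agg_func must be 'mean' or 'median'")
    # Group cells by their sample label; the result has samples as rows.
    return features.groupby(np.asarray(sample_ids)).agg(agg_func)


def scale_and_cap(x, cap: float = 0.01) -> np.ndarray:
    """Center and scale each column, then clip extremes (hypothetical helper).

    Assumption: `cap` is a quantile fraction, so values are clipped to the
    [cap, 1 - cap] quantiles of each column.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    sd = x.std(axis=0, ddof=1)  # sample standard deviation, like R's sd()
    sd[sd == 0] = 1.0           # leave constant columns at zero instead of dividing by 0
    z = centered / sd
    lower = np.quantile(z, cap, axis=0)
    upper = np.quantile(z, 1 - cap, axis=0)
    return np.clip(z, lower, upper)


# Toy data: five cells from two samples, two features per cell.
cells = pd.DataFrame({"PC1": [1.0, 3.0, 10.0, 2.0, 4.0], "PC2": [0.0, 2.0, 4.0, 1.0, 1.0]})
samples = ["s1", "s1", "s1", "s2", "s2"]

pb_median = pseudobulk(cells, samples, "median")
pb_mean = pseudobulk(cells, samples, "mean")
scaled = scale_and_cap(pb_median.to_numpy())
```

On the toy data the median is less sensitive than the mean to the outlying cell in sample s1 (PC1 = 10), which is one reason to expose the aggregation method as an option.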

Technical details

The changes make the pseudobulk and normalization steps in Python produce results that match the R version. I added an optional parameter to subset to common samples and to choose the averaging function. I also ensure that the data are converted to numpy arrays before passing them to the penalized matrix decomposition functions.
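The common-sample restriction and the conversion to numpy arrays could look roughly like the sketch below. This is an illustration under assumptions, not the code in _load: the function name restrict_to_common_samples and the dictionary layout (one pseudobulk dataframe per cell type, indexed by sample) are hypothetical.

```python
from functools import reduce

import pandas as pd


def restrict_to_common_samples(per_cell_type: dict[str, pd.DataFrame]) -> dict[str, pd.DataFrame]:
    """Keep only samples present in every cell type, in one shared row order."""
    common = reduce(
        lambda a, b: a.intersection(b),
        (df.index for df in per_cell_type.values()),
    ).sort_values()
    return {ct: df.loc[common] for ct, df in per_cell_type.items()}


# Toy pseudobulk tables: sample s2 is missing from the B cells.
pb = {
    "T": pd.DataFrame({"PC1": [0.1, 0.2, 0.3]}, index=["s1", "s2", "s3"]),
    "B": pd.DataFrame({"PC1": [1.0, 2.0]}, index=["s3", "s1"]),
}

aligned = restrict_to_common_samples(pb)

# Convert to plain numpy arrays before any matrix decomposition step.
matrices = [df.to_numpy() for df in aligned.values()]
```

Aligning the row order before converting matters because the numpy arrays no longer carry sample labels, so row i has to refer to the same sample in every cell type.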

Additional context

These changes only affect the initial part of DIALOGUE (DIALOGUE1) and do not modify downstream analysis.

Feb 21 '25 17:02 grpinto

Codecov Report

Attention: Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 65.79%. Comparing base (6a97036) to head (c27ffed).

Files with missing lines	Patch %	Lines
pertpy/tools/_dialogue.py	92.00%	2 Missing :warning:

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #714      +/-   ##
==========================================
+ Coverage   63.17%   65.79%   +2.61%     
==========================================
  Files          47       47              
  Lines        6110     6127      +17     
==========================================
+ Hits         3860     4031     +171     
+ Misses       2250     2096     -154

Files with missing lines	Coverage Δ
pertpy/tools/_dialogue.py	`38.11% <92.00%> (+24.16%)`	:arrow_up:

... and 3 files with indirect coverage changes

Feb 22 '25 09:02 codecov-commenter

I also have a notebook that benchmarks the current implementation of DIALOGUE against the R version on their toy example. The results look good, but the notebook depends on a lot of data files and other material that might need some adjustment, so I need to speak to @Zethson first. My other PR has an operational version; it is very scrappy for now, but I think it is enough for the figures. I will get back to this in a week or so. For now I have to focus on Pfizer.

Thank you for all your comments Yuge :)

Mar 19 '25 17:03 grpinto
