`normalize_total` with numba
What kind of feature would you like to request?
Other?
Please describe your wishes
This would speed up normalization
Waiting for potential work already done on this by the Intel folks (@ashish615 have you guys worked on this?)
@Intron7 The code at the link below may only work for CSR matrices.
https://github.com/IntelLabs/Open-Omics-Acceleration-Framework/blob/main/pipelines/single-cell-RNA-seq-analysis/notebooks/fastpp.py#L499-L522
Implement with or without Intel PR
So the implementation from Intel would replace the axis_mul_or_truediv function we use as the last step here:
https://github.com/scverse/scanpy/blob/834159ae1e938a29b3e98b89366c62a40bbe1966/src/scanpy/preprocessing/_normalization.py#L28-L52
Everything done up to that point is to find the median in case no fixed target sum is given. So I would essentially have to use numba in axis_mul_or_truediv. Does that sound right, @ilan-gold @flying-sheep? If so, I will proceed with that.
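For illustration, here is a minimal sketch of what a numba version of that last step could look like for CSR input. It is not the Intel code and not scanpy's implementation. The names `csr_row_sums`, `csr_scale_rows_inplace`, and `normalize_total_csr` are hypothetical, the kernels work on the raw `data` and `indptr` arrays of a CSR matrix, and the values are assumed to be castable to float. The `try`/`except` is only there so the sketch still runs (slowly) when numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback: plain Python, only to keep the sketch runnable
    prange = range

    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def csr_row_sums(data, indptr):
    """Sum the stored values of each row of a CSR matrix."""
    n_rows = indptr.shape[0] - 1
    sums = np.zeros(n_rows, dtype=np.float64)
    for i in prange(n_rows):
        total = 0.0
        for j in range(indptr[i], indptr[i + 1]):
            total += data[j]
        sums[i] = total
    return sums


@njit(parallel=True)
def csr_scale_rows_inplace(data, indptr, factors):
    """Multiply the stored values of row i by factors[i], in place."""
    n_rows = indptr.shape[0] - 1
    for i in prange(n_rows):
        for j in range(indptr[i], indptr[i + 1]):
            data[j] *= factors[i]


def normalize_total_csr(data, indptr, target_sum=None):
    """Hypothetical driver: scale each non-empty row to sum to target_sum."""
    data = data.astype(np.float64)  # astype copies, so the input is untouched
    counts = csr_row_sums(data, indptr)
    nonzero = counts > 0
    if target_sum is None:
        # Assumption: use the median of the non-zero row totals as the target.
        target_sum = np.median(counts[nonzero])
    factors = np.ones_like(counts)  # empty rows keep a factor of 1
    factors[nonzero] = target_sum / counts[nonzero]
    csr_scale_rows_inplace(data, indptr, factors)
    return data
```

With a `scipy.sparse.csr_matrix` named `X`, this would be called as `normalize_total_csr(X.data, X.indptr)`. Rows are independent of each other in CSR layout, which is why both loops can use `prange`. A CSC or dense input would need a different kernel.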
@selmanozleyen That does sound correct, although I had not looked into it closely before just now, and there is not much of a breadcrumb trail to follow since the Intel code is so different. So I don't want to say "yes for sure", but it does seem that way. It's possible axis_sum could be another place to use numba, I'm really not sure. BTW a good place for something like "fast axis multiplication" or "axis sum" could be https://github.com/scverse/fast-array-utils (and in fact maybe it should be there). But prototyping here and getting the actual numba function ready would make sense. Maybe @Intron7 wants to add which other parts of normalize_total would be "numba-able", if not those mentioned?
I removed the axis from the name, since extending the functions to allow axis=None wasn’t that hard, and yes, sum is there.
See here for my thoughts about the scope of fast-array-utils: https://github.com/scverse/scanpy/issues/3449