kBET Improve runtime of kBET

kBET is slow, partly because it runs many computations multiple times (for instance, to obtain good statistics for the rejection rate).

[ ] ensure that neighbourhoods are computed at most once
[ ] revisit the subsampling implementation
[ ] use a more efficient kNN computation (FNN at the moment)
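
A minimal sketch of the first item, written in Python with NumPy rather than in the package's R. It assumes integer batch labels `0..B-1` with at least one cell per batch, and uses a brute-force neighbour search that is only reasonable for small inputs. The function names (`knn_indices`, `chi2_sf`, `rejection_rate`, `repeated_rejection_rates`) are hypothetical and not part of kBET; the per-neighbourhood test is a plain Pearson chi-squared test against the global batch frequencies. The point is only that the neighbour table is built once and then reused by every repeat.

```python
import math
import numpy as np


def knn_indices(X, k):
    """Brute-force k nearest neighbours per row, excluding the row itself."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a cell is never its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def chi2_sf(x, dof):
    """Upper tail probability of a chi-squared variable, integer dof >= 1."""
    half = x / 2.0
    if dof % 2 == 0:
        terms = [half ** i / math.factorial(i) for i in range(dof // 2)]
        return math.exp(-half) * sum(terms)
    terms = [half ** (i - 0.5) / math.gamma(i + 0.5)
             for i in range(1, (dof - 1) // 2 + 1)]
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


def rejection_rate(neighbours, batch, tested, alpha=0.05):
    """Fraction of tested cells whose neighbourhood fails the batch test."""
    n_batches = int(batch.max()) + 1
    k = neighbours.shape[1]
    # Expected counts per batch if the neighbourhood mirrored the global mix.
    expected = np.bincount(batch, minlength=n_batches) / batch.size * k
    rejected = 0
    for cell in tested:
        observed = np.bincount(batch[neighbours[cell]], minlength=n_batches)
        stat = float(np.sum((observed - expected) ** 2 / expected))
        if chi2_sf(stat, n_batches - 1) < alpha:
            rejected += 1
    return rejected / len(tested)


def repeated_rejection_rates(X, batch, k, n_repeats=20, frac=0.1, seed=0):
    """Repeat the subsampled test many times on ONE precomputed kNN table."""
    rng = np.random.default_rng(seed)
    neighbours = knn_indices(X, k)  # the expensive step, done exactly once
    n_tested = max(1, int(frac * X.shape[0]))
    rates = []
    for _ in range(n_repeats):
        tested = rng.choice(X.shape[0], size=n_tested, replace=False)
        rates.append(rejection_rate(neighbours, batch, tested))
    return np.array(rates)


# Small synthetic usage: two batches of 100 cells in 5 dimensions.
demo_rng = np.random.default_rng(42)
demo_batch = np.repeat([0, 1], 100)
demo_X = demo_rng.normal(size=(200, 5))
demo_rates = repeated_rejection_rates(demo_X, demo_batch, k=20)
```

Each repeat then costs only index lookups and a small count per tested cell, so adding repeats for a more stable rejection-rate estimate does not trigger another neighbour search.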

Apr 29 '22 08:04 mbuttner

Hi there,

First of all, thank you for kBET, a very useful tool. I am trying to use kBET to assess the integration quality of single-cell samples (processed with Seurat). First question: in this case, would the batch number be the number of cells in my integrated object, or the number of samples (i.e. "stimulated", "non-stimulated")? I used the recommended lines (separating the kNN computation from the kBET function), but unfortunately the running times are huge. My object is 17k cells x 20k genes. Would you advise me to randomly subset my data before running kBET?

Thank you in advance for your help,

Best, Lilia

Nov 09 '22 15:11 liliay

Hi @liliay

thank you for trying kBET.

I would use the batch label of the cells, not the condition.
About the runtime: I recommend reducing the number of initial dimensions. You can compute a PCA on the data and use only the first 50 PCs or, if you have integrated the data with Seurat, take the embedding space as input. This should have a much lower dimension. Random subsampling might not be necessary.
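
As a rough illustration of the dimension reduction step, here is a minimal NumPy sketch. It assumes a dense matrix with cells in rows and genes in columns; `pca_embedding` is a hypothetical helper, not a kBET or Seurat function. It uses an exact SVD for clarity, whereas for a matrix of about 17k x 20k a truncated or randomised solver would likely be more practical.

```python
import numpy as np


def pca_embedding(X, n_components=50):
    """Project cells (rows) onto the leading principal components."""
    Xc = X - X.mean(axis=0)  # centre each gene (column)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    n_components = min(n_components, S.size)
    # PC scores: left singular vectors scaled by their singular values.
    return U[:, :n_components] * S[:n_components]


# Small synthetic usage: 100 cells x 500 genes reduced to 50 dimensions.
demo_rng = np.random.default_rng(0)
demo_counts = demo_rng.normal(size=(100, 500))
demo_embedding = pca_embedding(demo_counts, n_components=50)
```

The neighbour search then runs on the 50-column embedding instead of the full gene matrix, which is where most of the runtime saving would be expected to come from.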

Best, Maren

Nov 10 '22 15:11 mbuttner

Just wanted to plug our extremely fast Python version of kBET

https://github.com/YosefLab/scib-metrics/pull/60

It will be in this package soon. It does not have all the same functionality (no bootstrapping currently), but these things should not be difficult to add.

Dec 10 '22 01:12 adamgayoso

@adamgayoso

Thanks for sharing this! Your code looks quite neat and it is fantastic to learn about the speed-up. Did you also include an estimate of the neighborhood size? I might have missed it in the code.

Dec 20 '22 12:12 mbuttner

We did not, as it seemed the original scib package used a fixed k

https://github.com/theislab/scib/blob/da9c39b89b95b2ec34b6f547445e931571120ba6/scib/metrics/kbet.py#L144-L151

Dec 20 '22 16:12 adamgayoso
