Enable GPU
Hey Rajesh, thanks for the pull request! This looks interesting.
Have you tested that it works as expected? Do you get similar results with GPU set to True or False? Could you share any benchmarking results that might help users understand the expected benefit from using GPU? Thanks again!
Sorry, I have not fully tested it yet. I should have marked the MR as a draft. I will get back to you soon.
It is still failing a test.
@slowkow please find a unit test for performance committed (marked as skipped). When testing with data/pbmc_3500_pcs.tsv.gz the difference is 3 vs 5 seconds.
With larger datasets the difference is very noticeable: for an input of shape (35922, 25101) it is 10 seconds vs 73 seconds.
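For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch. The helper `time_call` is a hypothetical name and not part of this PR; it only measures wall-clock time for any callable, and GPU work that runs asynchronously may need an explicit synchronization step before the timer stops.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times; return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical usage (pcs and meta are placeholders for the PCA matrix and metadata):
# cpu_seconds = time_call(hm.run_harmony, pcs, meta, 'old.ident')
# For the GPU run, pass this PR's GPU switch as an extra keyword argument.
```

Taking the best of several runs reduces the influence of one-off startup costs, which can otherwise dominate on small inputs.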
The results are very similar. Differences only start to appear after the second decimal place. Please find attached the first 100 lines of output from the following code.
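import harmonypy as hm
import pandas as pd

# adata: AnnData object with PCA coordinates in adata.obsm['X_pca']
harmonized = hm.run_harmony(adata.obsm['X_pca'], adata.obs, 'old.ident')
df_harmonized = pd.DataFrame(harmonized.Z_corr)
df_harmonized.columns = adata.obs_names
adata.obsm['X_harmony'] = df_harmonized.T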
harmonized_cpu.csv harmonized_gpu.csv
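A rough way to quantify how close the two outputs are is sketched below. The function `summarize_difference` is a hypothetical helper, and the CSV layout in the usage comment (first column is the index) is an assumption about the attached files, not something confirmed in this thread.

```python
import numpy as np


def summarize_difference(cpu, gpu):
    """Return simple agreement statistics for two equally shaped numeric arrays."""
    cpu = np.asarray(cpu, dtype=float)
    gpu = np.asarray(gpu, dtype=float)
    if cpu.shape != gpu.shape:
        raise ValueError(f"shape mismatch: {cpu.shape} vs {gpu.shape}")
    abs_diff = np.abs(cpu - gpu)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        # Pearson correlation over all flattened entries
        "pearson_r": float(np.corrcoef(cpu.ravel(), gpu.ravel())[0, 1]),
    }


# Hypothetical usage with the attached files (index in the first column is assumed):
# import pandas as pd
# cpu = pd.read_csv("harmonized_cpu.csv", index_col=0).to_numpy()
# gpu = pd.read_csv("harmonized_gpu.csv", index_col=0).to_numpy()
# print(summarize_difference(cpu, gpu))
```

If the maximum absolute difference stays around the second decimal place, that would be consistent with the observation above.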
Please also find the statistical differences below:
Note: An extremely fast, GPU-accelerated version of Harmony is now available in rapids-singlecell: https://rapids-singlecell.readthedocs.io/en/latest/api/generated/rapids_singlecell.pp.harmony_integrate.html#rapids_singlecell.pp.harmony_integrate
CC @Intron7