scarches
Suggested improvements for label transfer function
Hi @M0hammadL @alextopalova ,
Here are some of my comments on/suggestions for your label transfer code:
- Could it be that this code is not yet part of the pip-installable scArches?
- Is there any reason why you chose to separate `sca.utils.knn.weighted_knn_trainer` from `sca.utils.knn.weighted_knn_transfer`? I think I would just make them into one function.
- What is the `package` argument for? I would remove it.
- About the `label_keys` argument: I would make that either an actual list of `obs` columns (at the moment the function looks for columns starting with whatever string you set it to), or just a single label (i.e. one `obs` column). I don't see the logic of the current setting (it was used in the context of HLCA-based label transfer, but it doesn't generalise well).
- I would make the function print some of the relevant information, e.g.:
  - the chosen k
  - when it is calculating the neighbour graph, with a note that this can take very long
  - which obs columns it is transferring, and when ("transferring labels for ..") etc.
- I think it would be great if you could feed it an existing neighbour graph: it takes quite long to calculate a kNN graph on such a large dataset, and we would need that graph anyway for generating a UMAP etc. It is very computationally inefficient at the moment.
Let me know what you think!
I would also write it in such a way that it updates the anndata object rather than outputting pandas dataframes!
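
To make the suggestions above a bit more concrete, here is a rough sketch of the interface I have in mind. It is not the current scArches code: the function name `transfer_labels`, the `_transferred` and `_uncertainty` column suffixes, and the simple inverse-distance weighting are all my own assumptions for illustration, and plain Python dicts stand in for the reference and query `adata.obs` so the example runs without any dependencies. The real implementation would take AnnData objects and could keep whatever weighting it uses now.

```python
from collections import defaultdict


def transfer_labels(ref_obs, query_obs, neighbors, label_keys, verbose=True):
    """Weighted kNN label transfer from a precomputed neighbour graph.

    ref_obs    : dict mapping column name -> list of labels, one per reference
                 cell (hypothetical stand-in for the reference `adata.obs`)
    query_obs  : dict mapping column name -> list (hypothetical stand-in for
                 the query `adata.obs`); updated in place
    neighbors  : for each query cell, a list of (reference_index, distance)
                 pairs taken from an existing neighbour graph
    label_keys : explicit list of reference columns to transfer
    """
    missing = [key for key in label_keys if key not in ref_obs]
    if missing:
        raise KeyError(f"Columns not found in reference obs: {missing}")

    if verbose:
        k = len(neighbors[0]) if neighbors else 0
        print(f"Using precomputed neighbour graph (k={k}); nothing is recomputed.")

    for key in label_keys:
        if verbose:
            print(f"Transferring labels for '{key}'...")
        labels, uncertainties = [], []
        for cell_neighbors in neighbors:
            votes = defaultdict(float)
            for ref_idx, dist in cell_neighbors:
                # Inverse-distance weight: an assumption chosen for simplicity.
                votes[ref_obs[key][ref_idx]] += 1.0 / (dist + 1e-8)
            best = max(votes, key=votes.get)
            labels.append(best)
            # Uncertainty = share of the total weight NOT on the winning label.
            uncertainties.append(1.0 - votes[best] / sum(votes.values()))
        # Write results into the query obs instead of returning data frames.
        query_obs[f"{key}_transferred"] = labels
        query_obs[f"{key}_uncertainty"] = uncertainties


# Toy usage: three reference cells, two query cells, k = 2.
ref_obs = {"cell_type": ["T cell", "T cell", "B cell"]}
query_obs = {}
neighbors = [[(0, 0.1), (2, 0.9)], [(2, 0.2), (1, 0.8)]]
transfer_labels(ref_obs, query_obs, neighbors, label_keys=["cell_type"])
```

The point of this shape is that a single call covers both the "trainer" and "transfer" steps, `label_keys` is an explicit list rather than a prefix match, the neighbour graph is passed in rather than rebuilt, and the results end up on the query object itself.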