
Suggested improvements for label transfer function

Open LisaSikkema opened this issue 2 years ago • 1 comments

Hi @M0hammadL @alextopalova,

Here are some of my comments on/suggestions for your label transfer code:

  1. Could it be that this code is not yet part of the pip-installable scArches?
  2. Is there any reason why you chose to separate sca.utils.knn.weighted_knn_trainer from sca.utils.knn.weighted_knn_transfer? I would just merge them into one function.
  3. What is the package argument for? I would remove it.
  4. About the label_keys argument: I would make it either an actual list of obs columns (at the moment the function looks for columns starting with whatever string you set it to), or just a single label (i.e. one obs column). I don't see the logic of the current setting (it was used in the context of HLCA-based label transfer, but it doesn't generalise well).
  5. I would make the function print some of the relevant information, e.g.:
     • the chosen k
     • when it is calculating the neighbour graph, with a note that this can take very long
     • which obs columns it is transferring and when ("transferring labels for..") etc.
  6. I think it would be great if you could feed it an existing neighbour graph. It takes quite long to calculate a kNN graph on such a large dataset, and we would need that graph anyway for generating a UMAP etc., so recomputing it is computationally very inefficient at the moment.
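
To make the suggestions above about a single function, an explicit list of label keys, progress messages, and a precomputed neighbour graph a bit more concrete, here is a rough plain-Python sketch. It is not the scArches API: the name `transfer_labels`, its arguments, the dict standing in for `obs`, and the Gaussian kernel with the mean neighbour distance as bandwidth are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def transfer_labels(neighbor_idx, neighbor_dist, ref_obs, label_keys, verbose=True):
    """Hypothetical one-step weighted kNN label transfer (illustration only).

    neighbor_idx / neighbor_dist: for each query cell, the indices of and
        distances to its k nearest reference cells (an existing kNN graph).
    ref_obs: dict mapping an obs column name to the list of reference labels.
    label_keys: explicit list of obs column names to transfer.
    """
    k = len(neighbor_idx[0])
    if verbose:
        print(f"Using precomputed neighbour graph with k={k}")

    results = {}
    for key in label_keys:
        if key not in ref_obs:
            raise KeyError(f"'{key}' is not a column of the reference obs")
        if verbose:
            print(f"Transferring labels for '{key}'...")

        ref_labels = ref_obs[key]
        labels, uncertainties = [], []
        for idx_row, dist_row in zip(neighbor_idx, neighbor_dist):
            # Assumed kernel: Gaussian, bandwidth = mean neighbour distance.
            sigma = sum(dist_row) / len(dist_row) or 1.0
            votes = defaultdict(float)
            for i, d in zip(idx_row, dist_row):
                votes[ref_labels[i]] += math.exp(-(d ** 2) / (2 * sigma ** 2))
            total = sum(votes.values())
            best = max(votes, key=votes.get)
            labels.append(best)
            # Uncertainty = share of the weight that did not go to the winner.
            uncertainties.append(1.0 - votes[best] / total)
        results[key] = {"labels": labels, "uncertainties": uncertainties}
    return results


# Toy example: 3 reference cells, 2 query cells, k=3.
ref_obs = {"cell_type": ["T", "T", "B"], "tissue": ["lung", "lung", "lung"]}
neighbor_idx = [[0, 1, 2], [2, 0, 1]]
neighbor_dist = [[0.1, 0.2, 1.0], [0.1, 1.0, 1.2]]
out = transfer_labels(neighbor_idx, neighbor_dist, ref_obs, ["cell_type", "tissue"])
```

Because the neighbour graph is passed in rather than computed inside the function, the same graph could in principle be reused for other downstream steps.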

Let me know what you think!

LisaSikkema, Feb 07 '23 13:02

I would also write it in such a way that it updates the anndata object rather than outputting pandas dataframes!

LisaSikkema, Feb 14 '23 11:02