openproblems-v2

[batch_int] Revamp control methods

Open rcannood opened this issue 1 year ago • 3 comments

The OpenProblems v1 repo has the following control methods:

  • No integration (Simply return PCA embedding)
  • Random integration:
    • Permute features
    • Permute graph
    • Permute PCA embedding
  • Permute per celltype
    • Permute features within cell types
    • Permute graph within cell types
    • Permute embedding within cell types
  • Permute per batch
    • Permute features within batches
    • Permute graph within batches
    • Permute embedding within batches

rcannood avatar Nov 16 '23 15:11 rcannood

Defining negative control methods is easy: simply permuting the features, graph or PCA embedding should do the trick. Positive controls are harder, because each one needs to be designed to target specific metrics.

This was already done in v1, so we should look for the discussion on how it was tackled there.
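As a rough illustration of the permutation-based negative controls, here is a minimal NumPy sketch. The function name `permute_indices` and the toy data are made up for this example and are not taken from the v1 or v2 code. It builds a row permutation that is either global or restricted to groups such as cell types or batches.

```python
import numpy as np


def permute_indices(n_obs, groups=None, seed=0):
    """Return a row permutation; if `groups` is given, shuffle only within each group.

    Hypothetical helper for illustration, not the OpenProblems implementation.
    """
    rng = np.random.default_rng(seed)
    if groups is None:
        # Global shuffle: any cell can end up in any position.
        return rng.permutation(n_obs)
    groups = np.asarray(groups)
    idx = np.arange(n_obs)
    for g in np.unique(groups):
        members = np.flatnonzero(groups == g)
        # Positions of group g receive a shuffled copy of group g's own indices.
        idx[members] = rng.permutation(members)
    return idx


# Toy PCA-like embedding: 6 cells, 2 components (assumed data).
embedding = np.arange(12, dtype=float).reshape(6, 2)
cell_type = np.array(["T", "T", "B", "B", "T", "B"])

idx_global = permute_indices(6)
idx_by_type = permute_indices(6, groups=cell_type)

random_embedding = embedding[idx_global]
random_embedding_by_type = embedding[idx_by_type]
```

Passing batch labels instead of `cell_type` would give the per-batch variant with the same function.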

rcannood avatar Nov 22 '23 12:11 rcannood

Going off what we have on the v1 website, these are all batch integration baselines:

[image: batch integration baselines listed on the v1 website]

Out of these, we're currently missing:

  • No integration
  • Random integration by cell type
  • Random integration by batch
  • Random graph by cell type

mumichae avatar Apr 19 '24 11:04 mumichae

From what I can see, the control methods are not organised in the hierarchy you listed here, @rcannood, but I think adopting it would be useful so that we don't get confused by the naming. This will require reorganising the v1 code and changing the number of control methods. The alternative is to run a single method and post-process its one result differently for each output type (feature, embedding, graph), which would complicate things.
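To make the "one outcome, several outputs" option concrete, here is a minimal sketch, assuming dense NumPy arrays and a single precomputed permutation. `apply_permutation` is a hypothetical helper, not an existing function in either repo. The same index array reorders the rows of the feature matrix and the embedding, and both the rows and columns of the graph adjacency matrix.

```python
import numpy as np


def apply_permutation(idx, features=None, embedding=None, graph=None):
    """Apply one cell permutation to whichever outputs are provided.

    Hypothetical helper for illustration; assumes dense arrays with cells as rows.
    """
    out = {}
    if features is not None:
        out["features"] = features[idx]      # reorder cells (rows)
    if embedding is not None:
        out["embedding"] = embedding[idx]    # reorder cells (rows)
    if graph is not None:
        # Relabel nodes: permute rows and columns of the adjacency matrix together.
        out["graph"] = graph[np.ix_(idx, idx)]
    return out


rng = np.random.default_rng(0)
n_cells = 5
features = rng.normal(size=(n_cells, 3))
embedding = rng.normal(size=(n_cells, 2))
adjacency = np.triu(rng.integers(0, 2, size=(n_cells, n_cells)), k=1)
adjacency = adjacency + adjacency.T          # symmetric toy graph

idx = rng.permutation(n_cells)
controls = apply_permutation(idx, features=features, embedding=embedding, graph=adjacency)
```

Whether sharing one permutation across output types is desirable is a design question for the task. The sketch only shows that it is mechanically simple for dense inputs.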

mumichae avatar Apr 19 '24 11:04 mumichae