openproblems-v2
                                
[batch_int] Revamp control methods
The OpenProblems v1 repo has the following control methods:
- No integration (simply return the PCA embedding)
- Random integration:
  - Permute features
  - Permute graph
  - Permute PCA embedding
- Permute per cell type:
  - Permute features within cell types
  - Permute graph within cell types
  - Permute embedding within cell types
- Permute per batch:
  - Same as per cell type (features, graph, embedding)

Defining negative control methods is easy: simply permuting the features, graph or PCA embedding should do the trick. Positive controls are harder, since we'll need to work out how to define them so that they target specific metrics.
This was already done in v1, so we should look up the discussion on how it was tackled there.
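
To make the negative controls above a bit more concrete, here is a minimal sketch of row permutation, either globally or within groups (cell types or batches). It assumes a dense NumPy matrix with one row per cell. The name `permute_rows` and its parameters are hypothetical and not taken from the v1 code, which may do this differently.

```python
import numpy as np


def permute_rows(X, groups=None, seed=0):
    """Shuffle the rows of X, optionally only within each group.

    Hypothetical helper for illustration. X is (n_cells, n_dims) and can
    hold features or a PCA embedding. `groups` holds one label per cell,
    e.g. cell type or batch.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    n = X.shape[0]
    if groups is None:
        # Global permutation: any cell can receive any other cell's row.
        return X[rng.permutation(n)]
    groups = np.asarray(groups)
    idx = np.arange(n)
    for g in np.unique(groups):
        members = np.flatnonzero(groups == g)
        # Rows are only exchanged between cells that share a label.
        idx[members] = rng.permutation(members)
    return X[idx]
```

Passing cell type labels as `groups` would give the "permute within cell types" variants, and passing batch labels would give the "permute per batch" ones.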
Going off what we have on the v1 website, these are all batch integration baselines:
Out of these, we're currently missing:
- No integration
- Random integration by cell type
- Random integration by batch
- Random graph by cell type
From what I can see, the control methods are not organised in the hierarchy that you listed here @rcannood, but I think it would be useful to do that so we don't get confused by the naming. This will require reorganising the v1 code and changing the number of control methods, unless we come up with a way to compute a method once and then process its different outputs (feature, embedding, graph) separately from that single result, which would complicate things.
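
As a rough illustration of the "compute once, process each output differently" option, the sketch below draws a single permutation and applies it to all three output types. It assumes dense NumPy arrays and a square adjacency matrix for the graph. The function name and the returned dictionary keys are hypothetical, not an existing OpenProblems API.

```python
import numpy as np


def apply_shared_permutation(features, embedding, adjacency, seed=0):
    """Apply one random cell permutation to three output types.

    Hypothetical sketch. `features` and `embedding` are (n_cells, n_dims)
    arrays, `adjacency` is an (n_cells, n_cells) neighbour graph.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(features.shape[0])
    return {
        # Feature and embedding outputs: reorder the rows.
        "features": features[perm],
        "embedding": embedding[perm],
        # Graph output: relabel the nodes, i.e. permute rows and columns together.
        "graph": adjacency[np.ix_(perm, perm)],
        "perm": perm,
    }
```

The three outputs would still need to be written out as separate control methods, so this only shares the random step, not the bookkeeping.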