contrastiveVI
Dealing with multiple conditions and details on model parameters
Good morning,
I'd like to try your tool. I don't have much experience with scVI and mainly work with the SingleCellExperiment class in Bioconductor, so it would be great to have your insights on how best to run it.
1). Multiple conditions: The data set I have is not unlike the MIX-Seq one, but I'm dealing with multiple conditions, not two. Would you run them all together, or separately in pairs (control vs. treatment 1, then control vs. treatment 2, etc.)?
2). Raw or normalized counts: Does the tool require raw or normalized counts? (I'm assuming raw, but wanted to make sure.)
3). Model parameters:
How do parameters such as the number of latent layers, n_hidden and batch size affect the output? I tried running my dataset using the parameters below (all conditions vs. DMSO). The model finished, but the UMAP of the salient representation was one big blob.
## Initialize raw counts
adata.raw = adata
adata.layers["counts"] = adata.X.toarray() # keep raw counts for scdef
model = ContrastiveVI(
    adata,
    n_batch = 0, # no batch correction
    n_layers = 1,
    n_hidden = 128,
    n_salient_latent=10,
    n_background_latent=10,
    use_observed_lib_size=False
)
background_indices = np.where(adata.obs["Treatment"] == "DMSO")[0]
target_indices = np.where(adata.obs["Treatment"] != "DMSO")[0]
model.train(
    check_val_every_n_epoch = 1,
    train_size = 0.8, # 0 to 1
    background_indices = background_indices,
    target_indices = target_indices,
    use_gpu = True,
    early_stopping = True,
    max_epochs = 1000,
    batch_size = 512
)
## Convert things back into R to continue with SCE class
reducedDim(sce, "salient") <- scd$obsm[["salient_rep"]] # assign as dimensional reduction to sce
set.seed(100)
sce <- runUMAP(sce, dimred = "salient", n_neighbors = 5, name = paste0("UMAP_salient"), BPPARAM = mcparam) # assuming euclidean distance works as a metric
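Regarding question 1, the pairwise alternative I have in mind would split the cell indices per treatment, roughly as in the sketch below. This is only an illustration: `pairwise_indices` is a hypothetical helper of mine, not part of contrastiveVI, and the toy `labels` list stands in for adata.obs["Treatment"].

```python
def pairwise_indices(treatments, control="DMSO"):
    """Map each non-control label to (background_indices, target_indices).

    Hypothetical helper for illustration only; not part of contrastiveVI.
    """
    # Control cells form the shared background for every pair.
    background = [i for i, t in enumerate(treatments) if t == control]
    pairs = {}
    for i, t in enumerate(treatments):
        if t != control:
            # Create the entry on first sight, then collect this treatment's cells.
            pairs.setdefault(t, (background, []))[1].append(i)
    return pairs


# Toy labels standing in for adata.obs["Treatment"] (assumed data).
labels = ["DMSO", "drugA", "DMSO", "drugB", "drugA"]
pairs = pairwise_indices(labels)

for treatment, (bg_idx, tg_idx) in pairs.items():
    print(treatment, "background:", bg_idx, "target:", tg_idx)
```

Each pair would then get its own model run with that pair's background and target indices, instead of the single all-vs-DMSO run shown above.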
Many thanks : )