celltypist

downsampling before training the model

Open Flu09 opened this issue 6 months ago • 1 comments

Training a custom model is taking longer than I anticipated. What is the ideal way to downsample the reference while keeping all cell types, and how can I do this in Python? What is the drawback of downsampling versus using HVGs?

new_model = celltypist.train(ref_adata, labels = 'cell_type_high_resolution', n_jobs = 30, feature_selection = True)
⚠️ Warning: it may take a long time to train this dataset with 2359994 cells and 31629 genes, try to downsample cells and/or restrict genes to a subset (e.g., hvgs)
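
One possible way to downsample while keeping every cell type is to cap the number of cells per label, so that abundant types are subsampled and rare types are kept whole. The sketch below is only an illustration under stated assumptions: the helper `stratified_downsample_indices` and the cap `max_per_type` are hypothetical names, not part of CellTypist, and the AnnData usage in the comments assumes labels are stored in `ref_adata.obs['cell_type_high_resolution']` as in the call above.

```python
import random
from collections import defaultdict


def stratified_downsample_indices(labels, max_per_type, seed=0):
    """Return sorted positions, keeping at most `max_per_type` cells per label.

    Hypothetical helper: labels with fewer cells than the cap are kept whole,
    so no cell type is dropped.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible subsample

    # Group cell positions by their label.
    positions_by_type = defaultdict(list)
    for position, label in enumerate(labels):
        positions_by_type[label].append(position)

    kept = []
    for positions in positions_by_type.values():
        if len(positions) > max_per_type:
            # Sample without replacement from over-represented types only.
            positions = rng.sample(positions, max_per_type)
        kept.extend(positions)
    return sorted(kept)


# Toy example with one abundant, one moderate, and one rare cell type.
toy_labels = ["T cell"] * 1000 + ["B cell"] * 50 + ["rare type"] * 3
toy_idx = stratified_downsample_indices(toy_labels, max_per_type=100)

# Assumed usage with the reference AnnData (not run here):
# labels = ref_adata.obs["cell_type_high_resolution"].tolist()
# idx = stratified_downsample_indices(labels, max_per_type=5000)
# small_adata = ref_adata[idx].copy()
```

The trade-off is roughly this: downsampling discards cells, so abundant types lose some of their within-type variation, while restricting to HVGs discards genes and keeps every cell. The two can also be combined.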

Flu09 Aug 08 '24 11:08