celltypist
Downsampling before training the model
Training a custom model is taking longer than I anticipated. What is the best way to downsample the reference while keeping all cell types? How can I do this in Python? And what are the drawbacks of downsampling compared with restricting to HVGs?
import celltypist
new_model = celltypist.train(ref_adata, labels='cell_type_high_resolution', n_jobs=30, feature_selection=True)
⚠️ Warning: it may take a long time to train this dataset with 2359994 cells and 31629 genes, try to downsample cells and/or restrict genes to a subset (e.g., hvgs)
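To make the question concrete, here is a minimal sketch of what I mean by downsampling while keeping all cell types: cap the number of cells per label, so abundant types are subsampled and rare types are kept whole. It uses only NumPy. The helper name `stratified_downsample`, the cap of 200, and the toy labels are my own placeholders, not part of celltypist.

```python
import numpy as np

def stratified_downsample(labels, max_per_type, seed=0):
    """Return sorted row indices keeping at most `max_per_type` cells per label.

    Hypothetical helper for illustration; not a celltypist function.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    keep = []
    for cell_type in np.unique(labels):
        idx = np.flatnonzero(labels == cell_type)
        # Subsample only the types that exceed the cap; rare types stay intact.
        if idx.size > max_per_type:
            idx = rng.choice(idx, size=max_per_type, replace=False)
        keep.append(idx)
    return np.sort(np.concatenate(keep))

# Toy labels standing in for ref_adata.obs['cell_type_high_resolution'].
labels = np.array(["T cell"] * 1000 + ["B cell"] * 300 + ["pDC"] * 20)
keep_idx = stratified_downsample(labels, max_per_type=200)

# With an AnnData object, the subset would then be taken as:
# ref_small = ref_adata[keep_idx].copy()
```

With a cap like this, every cell type is still represented in the training set, but the within-type variation of the large populations is only partially sampled, which is the trade-off I am unsure about relative to keeping all cells and restricting genes.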