celltypist

multiple models

Open anke-king opened this issue 3 months ago • 3 comments

I would like to train CellTypist on two different data sets. Should I merge the two data sets and train one model, or train two models and run the annotation twice?

anke-king avatar Mar 22 '24 15:03 anke-king

@anke-king, if you train them separately, you will get two independent models. If you want to combine them for training, you have to unify their annotations so that cell type names are consistent. Both approaches are feasible (I personally prefer the former, training separately, as it's quicker and makes it intuitive to check the consistency of predictions from the two datasets).

ChuanXu1 avatar Mar 25 '24 22:03 ChuanXu1
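
For the combined route, the annotation-unification step mentioned above could look roughly like the following. This is a minimal sketch in plain Python, not CellTypist code: the label names, `LABEL_MAP`, and both helper functions are hypothetical and only illustrate mapping two naming schemes onto one before training.

```python
# Hypothetical mapping from dataset-specific names to one shared name.
# Replace these with the annotations used in your own datasets.
LABEL_MAP = {
    "CD4 T": "CD4+ T cell",
    "CD4+ T cells": "CD4+ T cell",
    "B": "B cell",
    "B cells": "B cell",
}


def unify_labels(labels, label_map):
    """Map each label to its shared name; unmapped labels pass through unchanged."""
    return [label_map.get(label, label) for label in labels]


def unmapped_labels(labels, label_map):
    """Return labels the mapping does not mention at all, for manual review."""
    known = set(label_map) | set(label_map.values())
    return sorted(set(labels) - known)


# Hypothetical per-cell annotations from the two training datasets.
labels_a = ["CD4 T", "B", "NK cell"]
labels_b = ["CD4+ T cells", "B cells", "Monocyte"]

# Unified labels, in the same order as the concatenated cells.
combined = unify_labels(labels_a, LABEL_MAP) + unify_labels(labels_b, LABEL_MAP)

print(sorted(set(combined)))
print(unmapped_labels(labels_a + labels_b, LABEL_MAP))
```

The unified label list, kept in the same cell order as the concatenated expression data, is then what you would supply as the labels when training a single model. Labels reported by `unmapped_labels` are worth checking by hand, since they may be genuinely dataset-specific cell types or simply names the mapping missed.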

Thank you for your reply! Just for clarification: I have one data set with cell types for training and a second data set with cell types that are not in the first data set. In my target data set (which I want to annotate with my custom model), I expect to see cell types from both data sets. So if I do the former, should I run the annotation twice and select the cell type based on the confidence score, or how would I get the consensus annotation?

Thanks!!

anke-king avatar Mar 26 '24 12:03 anke-king

@anke-king, if the cell types in the first and second training datasets are totally different, you can combine them and train a single model. Confidence scores are not comparable across two different models, so if you use two models, you need to inspect each model's predictions separately (celltypist.dotplot is useful in most cases) and judge based on your own knowledge.

ChuanXu1 avatar Mar 30 '24 11:03 ChuanXu1
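
If you do keep two models, one way to organize the manual inspection without comparing confidence scores is to count how the two sets of predicted labels co-occur across the same cells. The sketch below is plain Python and makes several assumptions: the per-cell labels have already been extracted from each model's predictions as lists in the same cell order, and the label values and the `cross_tabulate` helper are hypothetical. It complements, rather than replaces, `celltypist.dotplot`.

```python
from collections import Counter


def cross_tabulate(pred_a, pred_b):
    """Count how often each (model A label, model B label) pair occurs across cells."""
    if len(pred_a) != len(pred_b):
        raise ValueError("Both prediction lists must cover the same cells in the same order.")
    return Counter(zip(pred_a, pred_b))


# Hypothetical per-cell labels from two separately trained models,
# applied to the same five target cells.
pred_model_a = ["T cell", "T cell", "B cell", "B cell", "T cell"]
pred_model_b = ["Fibroblast", "Fibroblast", "Endothelial", "Fibroblast", "Fibroblast"]

table = cross_tabulate(pred_model_a, pred_model_b)

# Print one row per label pair with the number of cells that received it.
for (label_a, label_b), n_cells in sorted(table.items()):
    print(f"{label_a:10s} {label_b:12s} {n_cells}")
```

Each row shows a group of cells that received a given pair of labels. Which label of the pair to trust for that group still has to be decided from marker expression and biological knowledge, as the reply above says.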