scGPT
Cell type labels required for pre-training?
First, thanks a lot to the creators of this tool and to the authors helping users here on GitHub! Your advice is incredibly useful.
I have a quick question: are cell type labels needed during pre-training? I've tried to read through the pre-training code in the dev branch, but it's beyond my current capabilities.
Besides cell type annotations, are there other variables that need to (or could) be used for pre-training, such as sequencing technology? As I understand it, these would be treated as conditional tokens that the model would learn. Please correct me if I'm wrong.
Sorry if this has already been asked before, but I could not find a clear statement on this.