
Cell type labels required for pre-training?

Status: Open. xoelmb opened this issue 8 months ago • 0 comments

First, thanks a lot to the creators of this tool and to the authors helping users here on GitHub! Your advice is incredibly useful. I have a quick question: are cell type labels needed during pre-training? I've tried to read through the pre-training code in the dev branch, but it's beyond my current capabilities.
Besides cell type annotations, are there other variables that need to (or could) be used for pre-training, such as sequencing technology? As I understand it, these would be treated as conditional tokens that the model learns. Please correct me if I'm wrong.

Sorry if this has already been asked before, but I could not find a clear statement on this.

xoelmb, Jun 26 '24 17:06