sc-foundation-eval
Finetune for a different dataset
Hi,
I have another question. In the code below, you reduce the number of genes in the incoming dataset, so it no longer has 16906 genes. Isn't this problematic when the gene2vec positional embedding is applied to this data, since gene2vec assumes that the input has 16906 genes (columns)? I don't think it raises an error, but the position vectors assigned to the indexed genes would be misleading.
if args.small_geneset:
    data = preprocess_data_smallgeneset(args.data_path)
    print("Filtered data to include {} genes present in at least 5% of cells".format(data.shape[1]))
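To make the concern concrete, here is a minimal sketch of what I mean. It is not the repository's code: the embedding table is random, and `EMB_DIM`, `kept_gene_idx`, and the number of kept genes are hypothetical values I chose for illustration. It assumes the positional embedding is looked up by column position, which is my reading of how gene2vec is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FULL_GENES = 16906  # full gene vocabulary size mentioned above
EMB_DIM = 8           # hypothetical embedding width, kept small for the demo

# Stand-in for the gene2vec table: row i is the vector of gene i
# in the full 16906-gene ordering (random values, illustration only).
gene2vec = rng.normal(size=(N_FULL_GENES, EMB_DIM))

# Hypothetical result of the small-geneset filter: the indices, in the
# full ordering, of the genes that survive.
kept_gene_idx = np.sort(rng.choice(N_FULL_GENES, size=2000, replace=False))
n_kept = kept_gene_idx.shape[0]

# Lookup by column position: column j of the filtered matrix receives
# row j of gene2vec, which belongs to a different gene in general.
naive_pos_emb = gene2vec[np.arange(n_kept)]

# Lookup by original gene index: column j receives the vector of the
# gene it actually holds.
aligned_pos_emb = gene2vec[kept_gene_idx]

n_mismatched = int((kept_gene_idx != np.arange(n_kept)).sum())
print("columns with a mismatched position vector:", n_mismatched, "of", n_kept)
```

If the filtered data kept track of the original gene indices, the second lookup would keep each gene paired with its own vector. Does the small-geneset path do something like this, or are positions reassigned from 0?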