
Using partial genotype reference

Open Thapeachydude opened this issue 2 years ago • 2 comments

Hi everyone,

Does the tool allow the use of partial genotype references? Say we have 10 patients multiplexed in a single sample, with reference genotypes available for 6 of them. Will it run if I provide the 6 reference profiles and determine the remaining 4 clusters from the scRNA-seq data alone, or will this cause an error?

Also, does the tool now automatically label the clusters with the names provided in the reference? E.g. patient "A" would be cluster "A" rather than 0, 1, 2, 3... I saw an older post where this was discussed.

Any insights would be much appreciated! Cheers.

Thapeachydude, Jun 30 '22 21:06

That is currently not supported, sorry.

wheaton5, Jun 30 '22 22:06

Dear @wheaton5, I am running souporcell with --known_genotypes and --known_genotypes_sample_names. I have 6 samples in each library, so I set -k 6. However, in one particular library, one of the 6 samples is not present in the VCF. If I set -k 6 but provide only 5 --known_genotypes_sample_names, I get: AssertionError: length of known genotype sample names should be equal to k/clusters.

Is there any way I can demultiplex this library with 6 samples when only 5 of them are present in the genotype VCF?

Thanks for developing this fantastic tool!

rubenchazarra, Jul 28 '22 09:07
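
For anyone hitting the same AssertionError, a small pre-flight check can catch the mismatch before a long run. The sketch below is not part of souporcell and does not work around the limitation. It only reads the sample columns from the `#CHROM` header line of a VCF and compares them with the names you intend to pass. The function names (`vcf_sample_names`, `check_known_genotypes`) and the sample names `S1` to `S5` are hypothetical, and the "count must equal k" rule is taken solely from the error message quoted above, not from souporcell's source.

```python
import io


def vcf_sample_names(handle):
    """Return the sample column names from the #CHROM header line of a VCF."""
    for line in handle:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # The first 9 columns are fixed (CHROM through FORMAT);
            # every column after them is a sample name.
            return fields[9:]
    raise ValueError("no #CHROM header line found")


def check_known_genotypes(vcf_samples, requested_names, k):
    """Report requested names missing from the VCF and whether their count equals k.

    Assumption: the count rule mirrors the AssertionError text quoted in
    this thread; it is not derived from souporcell's code.
    """
    missing = [name for name in requested_names if name not in vcf_samples]
    return {
        "missing": missing,
        "count_matches_k": len(requested_names) == k,
    }


# Hypothetical in-memory VCF header with 5 genotyped samples.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4\tS5\n"
)
samples = vcf_sample_names(example)
report = check_known_genotypes(samples, ["S1", "S2", "S3", "S4", "S5"], k=6)
print(report)
```

For a real file, open it with `open(path)` (or `gzip.open(path, "rt")` for a compressed VCF) and pass the handle to `vcf_sample_names`. In the 5-of-6 situation described above, `count_matches_k` comes back false, which is the condition the error message complains about.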