all unassigned for 10x single cell RNAseq dataset using 1000genome vcf

Open songeric1107 opened this issue 3 years ago • 3 comments

Hi Huang, thanks for looking at my previous issue. I have another problem. I am trying to run vireo on scRNA-seq datasets using a 1000 Genomes VCF file. However, all the cells are classified as unassigned. Is that due to the resolution of the VCF?

cellsnp-lite -s unassigned_alignments.bam -b barcodes.tsv.gz -O sc1_all_small_min5 -R ../../ref/genome1K.phase3.SNP_AF5e2.chr1toX.hg38.yang.vcf -p 20 --minMAF 0.1 --minCOUNT 5 --UMItag Auto --gzip

CELL_DIR=cellsnp-lit_analysis/new_analysis/sc1_all_small_min5/

OUT_DIR=vireo_analysis/sc1_output_min5/

vireo -c $CELL_DIR -N 4 -o $OUT_DIR

The log file:

Welcome to vireoSNP v0.5.6!

use -h or --help for help on argument.
[vireo] Loading cell folder ...
[vireo] Demultiplex 676216 cells to 4 donors with 37 variants.
[vireo] lower bound ranges [-907.9, -897.7, -892.9]
[vireo] allelic rate mean and concentrations:
[[0.138 0.529 0.98 ]]
[[ 264.8 1050.6 224.6]]
[vireo] donor size before removing doublets:
donor0 donor1 donor2 donor3
169054 169056 169050 169055
[vireo] final donor size:
unassigned
676216
[vireo] All done: 5 min 59.7 sec

songeric1107, Jan 18 '22 14:01

Hi, from the log, it seems that only 37 variants were used to demultiplex about 676K cells. You may want to check whether the chromosome IDs follow different patterns, e.g., with or without the "chr" prefix, between your BAM file and your *.yang.vcf.

Yuanhua

huangyh09, Jan 21 '22 01:01
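The chromosome-naming check suggested above can be sketched in a few lines of Python. This is only an illustration: the helper names (`vcf_contigs`, `sam_header_contigs`, `compare_contigs`) are made up for this example and are not part of vireo or cellsnp-lite, and it assumes the BAM header has been dumped to text first (for instance with `samtools view -H`).

```python
def vcf_contigs(vcf_lines):
    """Collect the distinct CHROM values from the data lines of a VCF."""
    names = set()
    for line in vcf_lines:
        # Skip meta/header lines ("##...", "#CHROM...") and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_contigs(header_text):
    """Collect the SN: names from the @SQ lines of a SAM/BAM text header."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def _strip_chr(name):
    """Drop a leading 'chr' so '1' and 'chr1' compare equal."""
    return name[3:] if name.startswith("chr") else name


def compare_contigs(vcf_names, bam_names):
    """Report exact matches and names that differ only by the 'chr' prefix."""
    shared = vcf_names & bam_names
    loose = {_strip_chr(n) for n in vcf_names} & {_strip_chr(n) for n in bam_names}
    prefix_only = loose - {_strip_chr(n) for n in shared}
    return {
        "shared": sorted(shared),
        "vcf_only": sorted(vcf_names - bam_names),
        "match_after_stripping_chr": sorted(prefix_only),
    }
```

For a gzipped VCF, the lines could be supplied with `gzip.open(path, "rt")`. An empty `shared` list together with a non-empty `match_after_stripping_chr` list would point to a prefix mismatch; if most names are shared, the low variant count probably has another cause.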

No, the format is the same.

After switching to a larger database, I am able to get the donor list:

Var1        Freq
donor0      4238
donor1      4148
donor2      3904
donor3      4216
doublet     234
unassigned  659476

However, the barcodes for each predicted donor could not be validated. I have the barcodes from 3 donors, and they do not overlap with the barcodes of any predicted donor.

songeric1107, Jan 25 '22 01:01
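One way to check the barcode overlap described above is to cross-tabulate vireo's labels against the known donors. The sketch below is only an assumption about what might help: it guesses that the two barcode lists may differ in a trailing suffix such as "-1", which the thread does not confirm, and the function names are hypothetical. Note also that when no donor genotypes are supplied, labels such as donor0 to donor3 are arbitrary, so agreement shows up as each known donor concentrating in one label rather than as matching names.

```python
from collections import Counter


def normalize(barcode):
    """Strip a trailing '-<digits>' suffix (e.g. '-1') so formats are comparable."""
    barcode = barcode.strip()
    head, sep, tail = barcode.rpartition("-")
    return head if sep and tail.isdigit() else barcode


def overlap_table(predicted, known):
    """Cross-tabulate predicted labels against known donors.

    predicted: dict mapping barcode -> label assigned by the demultiplexer
    known:     dict mapping barcode -> known donor name
    Returns a Counter keyed by (predicted label, known donor).
    """
    known_norm = {normalize(b): donor for b, donor in known.items()}
    table = Counter()
    for barcode, label in predicted.items():
        donor = known_norm.get(normalize(barcode))
        if donor is not None:
            table[(label, donor)] += 1
    return table
```

If the table is still empty after normalization, the two barcode sets likely come from different libraries or runs rather than from a formatting difference.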

Have you resolved the issue with the low number of variants? I don't think 37 variants are likely to be informative enough to demultiplex a large number of cells well.

Yuanhua

huangyh09, Feb 02 '22 01:02
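To see how much genotype information each cell actually carries, one could count the variants with non-zero coverage per cell. The sketch below assumes the depth matrix written by cellsnp-lite is a Matrix Market coordinate file with variants as rows and cells as columns (believed to be `cellSNP.tag.DP.mtx` in the output folder, but please verify against your own output); the function name is hypothetical.

```python
from collections import Counter


def variants_per_cell(mtx_lines):
    """Count covered variants per cell from Matrix Market coordinate lines.

    Assumes rows are variants and columns are cells (1-based indices).
    Returns (n_variants, n_cells, Counter mapping cell index -> covered variants).
    """
    # Drop comment lines ("%...") and blank lines; the first remaining
    # line holds the dimensions: n_rows n_cols n_nonzero.
    body = (ln for ln in mtx_lines if ln.strip() and not ln.startswith("%"))
    n_variants, n_cells, _ = map(int, next(body).split())
    covered = Counter()
    for ln in body:
        _row, col, value = ln.split()
        if float(value) > 0:
            covered[int(col)] += 1
    return n_variants, n_cells, covered
```

With only 37 variants in total, most cells would be expected to have very few covered sites, which would be consistent with nearly every cell ending up unassigned.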