deepvariant

Maintain Barcode in Output

Open gneedle1 opened this issue 1 year ago • 4 comments

When running DeepVariant on single-cell PacBio long-read data, is it possible to have the output include the barcode in which each SNP was found?

gneedle1 • Feb 12 '24 15:02

Hi @gneedle1,

Currently it is not possible to have the barcodes output in the VCF. Thank you for the request.

kishwarshafin • Feb 12 '24 21:02

Would it be valid to subset the original BAM by barcode and run each subset individually?

gneedle1 • Feb 12 '24 22:02

Hi @gneedle1

It depends on the type of experiment. If all barcodes come from the same sample and you are trying to get at some other property (e.g. cell type or preparation), then it's a question of sequencing coverage. If the reads of a single barcode give you enough coverage to make good-quality calls (roughly 15x-20x at minimum, depending on your tolerance for errors), then subsetting by barcode could be reasonable. With less coverage, the effect of the reduced coverage will likely be much larger than whatever effect you are trying to detect.
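As a rough illustration of the coverage check described above, the sketch below estimates mean depth per barcode over a target region and lists the barcodes that reach a threshold. The function names, the `(barcode, aligned_length)` input shape, and the choice of 15x as the cutoff (the low end of the range mentioned above) are assumptions made for illustration; none of this is part of DeepVariant.

```python
from collections import defaultdict

# Assumed cutoff: the low end of the 15x-20x range mentioned above.
MIN_DEPTH = 15.0


def mean_depth_per_barcode(reads, region_length):
    """Estimate mean depth per barcode over a region.

    reads: iterable of (barcode, aligned_length) pairs (hypothetical input shape).
    region_length: number of bases in the target region.
    Returns a dict mapping barcode -> aligned bases / region length.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    aligned_bases = defaultdict(int)
    for barcode, aligned_length in reads:
        aligned_bases[barcode] += aligned_length
    return {bc: bases / region_length for bc, bases in aligned_bases.items()}


def barcodes_with_enough_coverage(depths, min_depth=MIN_DEPTH):
    """Return the sorted barcodes whose mean depth is at least min_depth."""
    return sorted(bc for bc, depth in depths.items() if depth >= min_depth)


if __name__ == "__main__":
    # Toy data: 20 reads of 1,000 bases for one barcode, 5 for another.
    toy_reads = [("AAAC", 1000)] * 20 + [("TTTG", 1000)] * 5
    toy_depths = mean_depth_per_barcode(toy_reads, region_length=1000)
    print(toy_depths)
    print(barcodes_with_enough_coverage(toy_depths))
```

Mean depth is only a coarse proxy, since depth can vary a lot along a region, so a per-locus check would be stricter.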

If the barcodes separate different samples (i.e. those with different germline DNA), then the correct thing is to separate by barcode.

I would need a little more information about the nature of the samples and what you are looking for to give you a more direct opinion.

AndrewCarroll • Feb 20 '24 18:02

The experimental setup is roughly:

  1. Hairy cell leukemia cells were isolated from the blood of patients.
  2. Single-cell cDNAs were synthesized and barcoded on the 10x Genomics platform (the cDNAs of each cell are barcoded individually).
  3. Direct RNA seq will be run for the single-cell cDNAs from one patient on a flow cell of an ONT sequencer.
  4. The goal is to call mutations for the cDNA in individual cells and then build single-cell phylogeny based on mutations.

gneedle1 • Feb 21 '24 14:02

I see, so in this case each barcode corresponds to a specific cell. For your purpose, you can split by barcode and call variants on each subset; it's really just a question of whether each cell has sufficient coverage.
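A minimal sketch of the splitting step, assuming the cell barcode is stored in a `CB:Z:` tag on each alignment (the convention in 10x-style BAMs; check which tag your pipeline actually writes). It works on SAM text lines in pure Python so the logic is easy to follow; the function names are hypothetical, and in practice a BAM-aware tool such as samtools or pysam would do this on the BAM itself.

```python
from collections import defaultdict


def barcode_of(sam_line, tag="CB"):
    """Return the value of a string-typed tag (e.g. CB:Z:...) or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    prefix = tag + ":Z:"
    # SAM has 11 mandatory fields; optional tags start at index 11.
    for field in fields[11:]:
        if field.startswith(prefix):
            return field[len(prefix):]
    return None


def split_sam_by_barcode(lines, tag="CB"):
    """Group SAM alignment lines by barcode.

    Returns (header_lines, {barcode: [alignment_lines]}).
    Alignments without the tag are skipped.
    """
    header = []
    by_barcode = defaultdict(list)
    for line in lines:
        if line.startswith("@"):  # header lines such as @HD, @SQ, @RG
            header.append(line)
            continue
        barcode = barcode_of(line, tag)
        if barcode is not None:
            by_barcode[barcode].append(line)
    return header, dict(by_barcode)


if __name__ == "__main__":
    # Toy records; the barcode values are made up.
    toy = [
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tCB:Z:AAAC-1",
        "r2\t0\tchr1\t120\t60\t4M\t*\t0\t0\tACGT\tIIII\tCB:Z:TTTG-1",
    ]
    header, groups = split_sam_by_barcode(toy)
    for barcode, records in groups.items():
        print(barcode, len(records))
```

Each per-barcode group, written out with the shared header and converted back to an indexed BAM, could then be passed to a separate variant-calling run.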

AndrewCarroll • Mar 12 '24 05:03

Thanks @AndrewCarroll for the follow-up answer. I will now close this issue.

pichuan • Mar 12 '24 05:03