pipeline-structural-variation
VCF output not reporting translocations/inversions/duplications
Hi,
I have used the command below to carry out a variant call with nanopore reads.
snakemake --snakefile /home/doresegu/scratch/apps/Tools/pipeline-structural-variation-2.0.2/Snakefile -p call --config input_fastq=/home/doresegu/scratch/private/Analysis/IsolatesToAssemble/sks047.fastq reference_fasta=/home/doresegu/scratch/private/PublicationAnal/NewGenomes/Cultured.fasta sample_name=sks047VsCultChecks --cores 32
This worked great and produced a VCF file. However, I noticed that only insertions and deletions are reported. I looked in cutesv_tmp.vcf, which is the raw output of cuteSV, and saw that other SV types were reported there, i.e. INV, TRA, DUP, BND.
So after some searching I managed to edit the Snakefile "filter_vcf" rule to include this:
sv_types = config.get("sv_type", "DEL INS TRA INV DUP"),
as it only had DEL INS previously.
However, this still didn't work. It appears that the filtering happens in the bcftools view command:
bcftools view -i '(SVTYPE = "DEL" || SVTYPE = "INV" || SVTYPE = "DUP" || SVTYPE = "INS" ) && ABS(SVLEN) > 30 && ABS(SVLEN) < 100000 && INFO/RE >= 8' sks047VsCultChecks/sv_calls/sks047VsCultChecks_cutesv_tmp.vcf > sks047VsCultChecks/sv_calls/sks047VsCultChecks_cutesv_filtered_tmp.vcf
I can't find this command in the snakefile at all. So I manually carried out the bcftools command:
bcftools view -i '(SVTYPE = "DEL" || SVTYPE = "INV" || SVTYPE = "DUP" || SVTYPE = "INS" || SVTYPE = "TRA" || SVTYPE = "BND" ) && ABS(SVLEN) > 30 && ABS(SVLEN) < 100000 && INFO/RE >= 8' sks047VsCultChecks/sv_calls/sks047VsCultChecks_cutesv_tmp.vcf > sks047VsCultChecks/sv_calls/sks047VsCultChecks_cutesv_filtered_tmp.vcf
This still did not change anything. However, after removing the "INFO/RE >= 8" condition, I did see these types in the output.
So my question is: where is the bcftools command, so that I can change it (to add more SVTYPEs) rather than having to carry out the filtering manually? I understand that the INFO/RE threshold depends on the mosdepth outputs, so I won't need to change that. I just want to make sure the pipeline isn't missing SV types that still fit within the thresholds.
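One possible reason the literal command cannot be found in the Snakefile is that the filter expression is assembled from parameters at run time. That is an assumption, not something confirmed in this thread. As a minimal sketch of the idea, the hypothetical helper below (build_sv_filter is not part of the pipeline) builds an expression of the same shape as the logged one from a space-separated list of SV types and the three thresholds:

```python
def build_sv_filter(sv_types, min_len=30, max_len=100000, min_support=8):
    """Return a bcftools -i expression selecting the given SV types.

    Hypothetical helper for illustration only; the defaults are the
    values seen in the logged command above.
    """
    # One 'SVTYPE = "X"' test per type, joined with logical OR.
    type_clause = " || ".join(f'SVTYPE = "{t}"' for t in sv_types)
    return (
        f"({type_clause}) && ABS(SVLEN) > {min_len} "
        f"&& ABS(SVLEN) < {max_len} && INFO/RE >= {min_support}"
    )


# Same space-separated form as the edited sv_type default.
expr = build_sv_filter("DEL INS TRA INV DUP".split())
print(expr)
```

The printed string could then be passed to bcftools view -i in place of the hand-edited expression.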
Hi @damioresegun
@philres should be able to confirm, but I believe that at the time of writing this pipeline, cuteSV categorised events only as INS or DEL. We'll have to check whether this is the case for the latest version (if we are not using it already).
Ok thanks. An update on my end: I can see that the different SVTYPEs are present in the raw cuteSV output, but it is the minimum read support (INFO/RE >= 8 in this case) that stops them from being called in the final variant file. So I guess the pipeline works as designed.
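For anyone wanting to check the same thing on their own data, here is a rough Python sketch that counts, per SVTYPE, how many records in a raw VCF meet a given INFO/RE threshold. The function names and the three example records are made up for illustration; only the SVTYPE, SVLEN and RE tags are taken from the commands in this thread.

```python
from collections import Counter


def parse_info(info_field):
    """Split a VCF INFO column into a dict (flags map to '')."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info


def count_by_support(vcf_lines, min_support=8):
    """Return (total, kept) Counters of records per SVTYPE.

    'kept' counts only records whose RE value is >= min_support.
    """
    total, kept = Counter(), Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        info = parse_info(line.rstrip("\n").split("\t")[7])
        svtype = info.get("SVTYPE", "NA")
        total[svtype] += 1
        if int(info.get("RE") or 0) >= min_support:
            kept[svtype] += 1
    return total, kept


# Made-up records, not real pipeline output.
example = [
    "##fileformat=VCFv4.2",
    "chr1\t100\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;SVLEN=-500;RE=12",
    "chr1\t900\tsv2\tN\t<INV>\t.\tPASS\tSVTYPE=INV;SVLEN=2000;RE=5",
    "chr2\t300\tsv3\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;SVLEN=800;RE=3",
]
total, kept = count_by_support(example)
for svtype in sorted(total):
    print(svtype, total[svtype], kept[svtype])
```

To run it on a real file, pass an open file handle instead of the example list, e.g. count_by_support(open("sks047VsCultChecks/sv_calls/sks047VsCultChecks_cutesv_tmp.vcf")).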