cuteSV
Obtain the INV fragments
Hi @tjiangHIT,
As we all know, many tools, including cuteSV, report only the predicted breakpoints for INV events. So I wonder whether the inverted fragments themselves can be obtained from the cuteSV result?
Best! Andy
Hello @AndyLy2Zy,
Sorry for the late reply. If I were to do this, I would 1) extract the reads supporting one INV from the cuteSV output, 2) generate a consensus fragment with tools like spoa or abpoa, 3) realign the consensus sequence to the reference genome, and 4) call the precise INV with tools like syri or pav in order to obtain the INV sequence. Hope this helps!
Best, Tao
Hi @tjiangHIT,
Thanks for your reply. So there is no way to obtain the INV sequence directly from cuteSV? One more question: can cuteSV be used on contigs? I assembled the reads into contigs because I did not know how to obtain the INV sequence.
Best, Andy
Hello @AndyLy2Zy,
Actually, using the breakpoints of the INV that cuteSV reports, you can extract the sequence involved in the INV from the reference genome and reverse its order. This gives you the INV sequence. However, the result may only be similar to the real sequence rather than identical to it, mainly because the INV breakpoints are predicted, as you know. As for your next question, cuteSV can detect SVs from assembly alignments; see the wiki page about diploid-assembly-based SV detection using cuteSV here. Once cuteSV has reported the INV, just follow the extraction operation described above to acquire the INV sequence.
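The extraction step described above can be sketched in a few lines of Python. This is a minimal illustration, not part of cuteSV: the function names are hypothetical, the chromosome is passed as an in-memory string, and the coordinates are assumed to be 1-based and inclusive. Note that an inverted segment read on the forward strand is the reverse complement of the reference segment, so the sketch complements the bases as well as reversing them.

```python
# Sketch only: helper names are hypothetical and not part of cuteSV.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def inverted_allele(ref_seq, start, end):
    """Return the reference segment [start, end] as it would read after inversion.

    start and end are assumed to be 1-based, inclusive breakpoints.
    Because the breakpoints are predictions, the result is an approximation
    of the real inverted sequence.
    """
    if not 1 <= start <= end <= len(ref_seq):
        raise ValueError("breakpoints fall outside the chromosome")
    return reverse_complement(ref_seq[start - 1:end])


# Toy chromosome, made up for illustration.
chrom = "AAACCGTTT"
print(inverted_allele(chrom, 4, 6))  # segment CCG becomes CGG
```

For a real genome, the chromosome string would come from a FASTA reader of your choice; only the slicing and reverse-complement logic is shown here.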
Best, Tao
Hi @tjiangHIT,
Thanks for your friendly reply. I used MEGAHIT to assemble the yeast genome into contigs. I then used minimap2 to map the contigs to the yeast reference genome obtained from the UCSC database. Next, I used cuteSV to call SVs, following the wiki page about diploid-assembly-based SV detection using cuteSV. However, no SVs were reported. These are the commands:
minimap2 --paf-no-hit -a -x asm5 --cs -r 2k -t 2 ~/data/refgenome/sacCer3_ucsc.fa final.contigs.fa > final.contigs.sam
nohup cuteSV final.contigs.sort.bam ~/data/refgenome/sacCer3_ucsc.fa final.contigs.sort.bam2.vcf test/ -p -1 -L -1 &
The results for cuteSV
Any suggestions would be appreciated!
Best, Andy
Hello @AndyLy2Zy,
Please add the parameter --retain_work_dir and check whether there are SV signatures in the working directory. Looking forward to your reply.
Tao
Hi @tjiangHIT,
I added the parameter --retain_work_dir. The results show that there are some SVs.
Best, Andy
Hello @AndyLy2Zy,
Please set the parameter "--min_support" as 1 and try again.
Best, Tao
Hi @tjiangHIT,
Although I added the parameter "--min_support 1" (or "-s 1"), it made no difference at all.
Command: cuteSV final.contigs.sort.bam ~/data/refgenome/sacCer3_ucsc.fa final.contigs.sort.bam2.vcf ./ -p -1 -L -1 --min_support 1 --retain_work_dir
Meanwhile, another error appeared:
Best, Andy
Hello, @AndyLy2Zy
The error may have occurred when extracting sequences from the given reference. Please check whether the chromosomes in the input BAM file are consistent with the input reference, that is, whether the chromosome named in the error, 'k119_100', is contained in the input BAM file or in the input reference. In addition, could you share part of your dataset if possible? That would help us find the cause. The dataset can be sent via email: [email protected].
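One way to run this consistency check is to compare the sequence names in the BAM header with the names in the reference FASTA. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the header text is assumed to come from something like `samtools view -H final.contigs.sort.bam`, and the tiny inputs are made up.

```python
# Sketch only: function names and the sample inputs are made up for illustration.

def fasta_names(fasta_text):
    """Collect sequence names from FASTA header lines (text up to the first whitespace)."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def sam_header_names(header_text):
    """Collect the SN: values from @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def names_missing_from_reference(header_text, fasta_text):
    """Names the alignments refer to that the reference FASTA does not contain."""
    return sorted(sam_header_names(header_text) - fasta_names(fasta_text))


header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chrI\tLN:230218\n@SQ\tSN:k119_100\tLN:5000\n"
fasta = ">chrI some description\nACGT\n>chrII\nACGT\n"
print(names_missing_from_reference(header, fasta))  # ['k119_100']
```

An empty list means every sequence named in the alignment header exists in the reference; any name that is listed would be worth investigating.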
Best, Shuqi
Dear Shuqi,
I'm so sorry for the late reply. I have sent the assembled genome to you by email.
Best, Andy
Hi @tjiangHIT,
I notice that the result VCF file only gives the number of reads supporting each SV event. I wonder whether there is a way to identify which reads support each event? Ideally, the supporting reads could be obtained directly.
Best, Andy
Hello, @AndyLy2Zy
You can add the parameter "--report_readid", which reports the supporting read IDs for each SV.
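Once the read IDs are in the VCF, they can be pulled out of the INFO column with a small parser. The sketch below assumes that the IDs are written as a comma-separated `RNAMES=` entry in INFO and that a placeholder such as `NULL` appears when they are not reported; both are assumptions about the output format that should be checked against the header of your own VCF. The function name and the sample record are made up.

```python
# Sketch only: assumes read IDs live in an INFO entry named RNAMES
# (check the ##INFO lines of your own VCF); the record below is made up.

def supporting_reads(vcf_line):
    """Return the list of supporting read IDs for one VCF record, or [] if none."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("not a complete VCF record")
    for entry in fields[7].split(";"):
        key, _, value = entry.partition("=")
        if key == "RNAMES" and value not in ("", "NULL"):
            return value.split(",")
    return []


record = "chrI\t100\tcuteSV.INV.0\tN\t<INV>\t.\tPASS\tSVTYPE=INV;END=200;RE=2;RNAMES=read1,read2"
print(supporting_reads(record))  # ['read1', 'read2']
```

The returned IDs could then be used to subset the BAM file, for example as input to the consensus-based approach mentioned earlier in this thread.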
Best, Shuqi
Hello, @AndyLy2Zy
Sorry for my late reply. I received the assembled genome and ran SV detection with the latest version of cuteSV. The commands I used are shown below:
minimap2 --paf-no-hit -a -x asm5 --cs -r 2k -t 2 sacCer3.fa.gz final.contigs.fa > final.contigs.sam
samtools sort final.contigs.sam | samtools view -bS -o final.contigs.sorted.bam && samtools index final.contigs.sorted.bam
cuteSV final.contigs.sorted.bam sacCer3.fa final.contigs.sort.bam2.vcf ./ -p -1 -L -1 -s 1
The run didn't show any errors and finished in 1 second. There are multiple types of SVs in the output VCF file.
> cat final.contigs.sort.bam2.vcf | grep -v '#' | awk -F 'SVTYPE=' '{print $2}' | awk -F ';' '{print $1}' |sort |uniq -c
17 BND
35 DEL
9 INS
10 INV
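To go from these counts back to the original question, the INV records themselves can be listed with their coordinates. The following is a minimal sketch, assuming each INV record carries `SVTYPE=INV` and an `END=` entry in its INFO column; the function name and the two sample records are made up for illustration.

```python
# Sketch only: assumes INV records have SVTYPE=INV and END=<pos> in INFO;
# the sample VCF text below is made up.

def inv_intervals(vcf_text):
    """Return (chrom, start, end) for every INV record in the VCF text."""
    intervals = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            continue
        # Turn "KEY=VALUE;FLAG" into {"KEY": "VALUE", "FLAG": ""}.
        info = dict(entry.partition("=")[::2] for entry in fields[7].split(";"))
        if info.get("SVTYPE") == "INV" and "END" in info:
            intervals.append((fields[0], int(fields[1]), int(info["END"])))
    return intervals


vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chrI\t100\tsv1\tN\t<INV>\t.\tPASS\tPRECISE;SVTYPE=INV;SVLEN=101;END=200\n"
    "chrII\t500\tsv2\tN\t<DEL>\t.\tPASS\tPRECISE;SVTYPE=DEL;SVLEN=-50;END=550\n"
)
print(inv_intervals(vcf))  # [('chrI', 100, 200)]
```

Each tuple gives the chromosome and the two predicted breakpoints, which is the information needed to extract the corresponding segment from the reference.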
Please make sure you are using the latest version from GitHub:
git clone https://github.com/tjiangHIT/cuteSV.git && cd cuteSV/ && python setup.py install
Best, Shuqi