
Achieve the INV fragments

Open AndyLy2Zy opened this issue 1 year ago • 13 comments

Hi @tjiangHIT,

As we all know, much SV-calling software, including cuteSV, reports only the predicted breakpoints for INV events. So I wonder whether we can obtain the inverted fragments themselves from the cuteSV result?

Best! Andy

AndyLy2Zy avatar Mar 25 '23 20:03 AndyLy2Zy

Hello @AndyLy2Zy,

Sorry for replying so late. If I were to do this, I would 1) extract the reads supporting one INV from the cuteSV output, 2) generate a consensus sequence with tools like spoa or abpoa, 3) realign the consensus sequence to the reference genome, and 4) call the precise INV with tools like syri or pav in order to obtain the INV sequence. Hope this helps!
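As a rough illustration of step 1, the sketch below pulls a chosen set of reads out of FASTQ text so they can be handed to a consensus tool. It is not part of cuteSV; the function names `iter_fastq` and `select_reads` and the sample reads are hypothetical, and the parser assumes plain four-line FASTQ records.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Record = Tuple[str, str, str]  # (read_id, sequence, quality)


def iter_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from four-line FASTQ input (assumption: no wrapped lines)."""
    records = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(records) - 3, 4):
        header, seq, _plus, qual = records[i:i + 4]
        # The read id is the first whitespace-delimited token after '@'.
        yield header[1:].split()[0], seq, qual


def select_reads(lines: Iterable[str], wanted: Set[str]) -> List[Record]:
    """Keep only the reads whose id is in `wanted`."""
    return [rec for rec in iter_fastq(lines) if rec[0] in wanted]


# Hypothetical input: two reads, of which only read2 supports the INV.
fastq_lines = [
    "@read1", "ACGT", "+", "IIII",
    "@read2 extra description", "GGCC", "+", "IIII",
]
print(select_reads(fastq_lines, {"read2"}))
```

The selected reads would then be written out and passed to the consensus step; that part depends on the chosen tool and is not shown here.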

Best, Tao

tjiangHIT avatar Mar 29 '23 02:03 tjiangHIT

Hi @tjiangHIT,

Thanks for your reply. So there is no way to obtain the INV sequence directly from cuteSV? One more question: can cuteSV be used on contigs? I assembled the reads into contigs because I don't know how to obtain the INV sequence.

Best, Andy

AndyLy2Zy avatar Mar 29 '23 16:03 AndyLy2Zy

Hello @AndyLy2Zy,

Actually, using the breakpoints of the INV that cuteSV reports, you can extract the sequence involved in the INV from the reference genome and reverse it (that is, take its reverse complement). This gives you the INV sequence. However, the result may only be similar to the real sequence rather than identical to it, mainly because the breakpoints of the INV are predicted, as you know.

For your next question, cuteSV can be used to detect SVs from assembly alignments; you can see the wiki page about diploid-assembly-based SV detection using cuteSV here. Once you have the INV reported by cuteSV, just follow the extraction operation mentioned above to acquire the INV sequence.
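The extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the breakpoints are treated as a 1-based, inclusive interval (POS to END as written in a VCF record), the toy reference and the helper names `parse_fasta`, `reverse_complement`, and `inverted_allele` are hypothetical, and the result is only as accurate as the predicted breakpoints.

```python
from typing import Dict

# Translation table for complementing bases; N is kept as N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {contig_name: sequence}."""
    chunks: Dict[str, list] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # name is the first token of the header
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {k: "".join(v) for k, v in chunks.items()}


def inverted_allele(ref: Dict[str, str], chrom: str, pos: int, end: int) -> str:
    """Reverse-complement the reference segment [pos, end] (1-based, inclusive).

    Assumption: pos/end are the INV breakpoints taken from the VCF record.
    """
    segment = ref[chrom][pos - 1:end]
    return reverse_complement(segment)


# Toy reference, not real yeast sequence.
reference = parse_fasta(">chrI demo\nAACCGGTT\nACGTAC\n")
print(inverted_allele(reference, "chrI", 2, 5))  # segment ACCG -> CGGT
```

For a real genome, an indexed FASTA reader would be more practical than loading the whole file into memory; the in-memory parser here just keeps the sketch self-contained.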

Best, Tao

tjiangHIT avatar Mar 30 '23 00:03 tjiangHIT

Hi @tjiangHIT,

Thanks for your friendly reply. I used MEGAHIT to assemble the yeast genome and obtained contigs. After that, I used minimap2 to map the contigs to the yeast reference genome obtained from the UCSC database. Next, I ran cuteSV to call SVs, following the wiki page about diploid-assembly-based SV detection using cuteSV. However, no SVs were reported. These are the commands:

minimap2 --paf-no-hit -a -x asm5 --cs -r 2k -t 2 ~/data/refgenome/sacCer3_ucsc.fa final.contigs.fa > final.contigs.sam
nohup cuteSV final.contigs.sort.bam ~/data/refgenome/sacCer3_ucsc.fa final.contigs.sort.bam2.vcf test/ -p -1 -L -1 &

The results for cuteSV: [image]

Any suggestions would be appreciated!

Best, Andy

AndyLy2Zy avatar Mar 30 '23 05:03 AndyLy2Zy

Hello @AndyLy2Zy,

Please add the parameter --retain_work_dir and check whether there are SV signatures in the directory or not. Look forward to your reply.

Tao

tjiangHIT avatar Mar 30 '23 06:03 tjiangHIT

Hi @tjiangHIT,

I added the parameter --retain_work_dir. The results show that there are some SV signatures. [image]

Best, Andy

AndyLy2Zy avatar Mar 30 '23 19:03 AndyLy2Zy

Hello @AndyLy2Zy,

Please set the parameter "--min_support" as 1 and try again.

Best, Tao

tjiangHIT avatar Mar 31 '23 00:03 tjiangHIT

Hi @tjiangHIT,

Although I have added the parameter "--min_support 1" (or "-s 1"), there is no difference at all. Command:

cuteSV final.contigs.sort.bam ~/data/refgenome/sacCer3_ucsc.fa final.contigs.sort.bam2.vcf ./ -p -1 -L -1 --min_support 1 --retain_work_dir

Meanwhile, another error appeared: [image]

Best, Andy

AndyLy2Zy avatar Mar 31 '23 00:03 AndyLy2Zy

Hello, @AndyLy2Zy

The error may have occurred while extracting sequences from the given reference. Please check whether the chromosomes in the input BAM file are consistent with the input reference, that is, whether the chromosome 'k119_100' named in the error is present in the input BAM file or in the input reference. In addition, could you provide part of your dataset if possible? That would help us find the cause. The dataset can be sent via email: [email protected].
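One way to run this consistency check is to compare the contig names in the BAM header with the names in the reference FASTA. The sketch below works on plain text, assuming the header has already been dumped (for example with `samtools view -H`); the helper names and the toy header and FASTA are hypothetical, with 'k119_100' used only to mirror the name from the error message.

```python
from typing import List, Set


def sam_header_contigs(header_text: str) -> Set[str]:
    """Collect the SN: names from @SQ lines of a SAM/BAM header dump."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def fasta_contigs(fasta_text: str) -> Set[str]:
    """Collect sequence names (first token of each '>' header line)."""
    return {
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


def missing_from_reference(header_text: str, fasta_text: str) -> List[str]:
    """Names present in the alignment header but absent from the reference."""
    return sorted(sam_header_contigs(header_text) - fasta_contigs(fasta_text))


# Toy inputs: the header lists a contig that the reference does not contain.
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chrI\tLN:14\n@SQ\tSN:k119_100\tLN:8\n"
fasta = ">chrI\nAACCGGTTACGTAC\n>chrII\nACGT\n"
print(missing_from_reference(header, fasta))  # ['k119_100']
```

An empty list means every contig in the alignment header also exists in the reference; any name that is printed is a candidate cause of a sequence-extraction failure.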

Best, Shuqi

Meltpinkg avatar Apr 03 '23 05:04 Meltpinkg

Dear Shuqi,

I'm so sorry for the late reply. I have sent the assembled genome to you by email.

Best, Andy

AndyLy2Zy avatar Apr 05 '23 00:04 AndyLy2Zy

Hi @tjiangHIT,

I notice that, for each SV event, the resulting VCF file only gives the number of reads supporting the record. Is there a way to identify which reads support each event? Ideally, I would like to obtain the supporting reads directly. [image]

Best, Andy

AndyLy2Zy avatar Apr 24 '23 23:04 AndyLy2Zy

Hello, @AndyLy2Zy You can add the parameter "--report_readid", which makes cuteSV report the supporting read IDs for each SV.
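Once the read IDs are in the VCF, they can be collected per SV with a short script. The sketch below is an assumption-laden illustration rather than documented behaviour: it assumes the IDs appear in the INFO column as a comma-separated `RNAMES=` entry and that `NULL` marks a record without reported reads, so please check the header of your own VCF. The function name and the example records are hypothetical.

```python
from typing import Dict, List


def supporting_reads(vcf_text: str, key: str = "RNAMES") -> Dict[str, List[str]]:
    """Map each SV ID to the read names listed under `key` in the INFO column.

    Assumptions: tab-separated VCF, INFO in column 8, comma-separated read
    names, and 'NULL' (or an empty value) meaning no names were reported.
    """
    result: Dict[str, List[str]] = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        sv_id, info = cols[2], cols[7]
        for entry in info.split(";"):
            if entry.startswith(key + "="):
                value = entry[len(key) + 1:]
                result[sv_id] = [] if value in ("", "NULL") else value.split(",")
    return result


# Hypothetical records for illustration only.
example_vcf = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "\t".join(["chrI", "100", "cuteSV.INV.0", "N", "<INV>", ".", "PASS",
               "SVTYPE=INV;END=500;RE=2;RNAMES=readA,readB"]),
    "\t".join(["chrI", "900", "cuteSV.DEL.0", "N", "<DEL>", ".", "PASS",
               "SVTYPE=DEL;END=950;RE=1;RNAMES=NULL"]),
])
print(supporting_reads(example_vcf))
```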

Best, Shuqi

Meltpinkg avatar Apr 25 '23 06:04 Meltpinkg

Hello, @AndyLy2Zy

Sorry for my late reply. I received the assembled genome and ran SV detection with the latest version of cuteSV. The commands I used are shown below:

minimap2 --paf-no-hit -a -x asm5 --cs -r 2k -t 2 sacCer3.fa.gz final.contigs.fa > final.contigs.sam
samtools sort final.contigs.sam | samtools view -bS -o final.contigs.sorted.bam && samtools index final.contigs.sorted.bam
cuteSV final.contigs.sorted.bam sacCer3.fa final.contigs.sort.bam2.vcf ./ -p -1 -L -1 -s 1

The run did not show any errors and finished in 1 second. The output VCF file contains multiple types of SVs.

> cat final.contigs.sort.bam2.vcf | grep -v '#' | awk -F 'SVTYPE=' '{print $2}' | awk -F ';' '{print $1}' |sort |uniq -c
     17 BND
     35 DEL
      9 INS
     10 INV
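For anyone who prefers Python to the grep/awk pipeline, the same per-type tally can be sketched as below. It is a minimal equivalent under the assumption that each record carries an `SVTYPE=` entry in the INFO column; the function name and the two example records are hypothetical and are not taken from the yeast dataset.

```python
from collections import Counter


def count_svtypes(vcf_text: str) -> Counter:
    """Count records per SVTYPE value found in the INFO column (column 8)."""
    counts: Counter = Counter()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        info = line.split("\t")[7]
        for entry in info.split(";"):
            if entry.startswith("SVTYPE="):
                counts[entry.split("=", 1)[1]] += 1
    return counts


# Hypothetical records for illustration only.
demo_vcf = "\n".join([
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrI\t100\tsv1\tN\t<INV>\t.\tPASS\tSVTYPE=INV;END=500",
    "chrI\t900\tsv2\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=950",
    "chrII\t50\tsv3\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=80",
])
for svtype, n in sorted(count_svtypes(demo_vcf).items()):
    print(n, svtype)
```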

Please make sure you are using the latest version from GitHub:

git clone https://github.com/tjiangHIT/cuteSV.git && cd cuteSV/ && python setup.py install

Best, Shuqi

Meltpinkg avatar Apr 25 '23 09:04 Meltpinkg