confused about the results: sv.txt and cnv.txt

Open QianghuiZhu opened this issue 2 years ago • 8 comments

Hi! SVMU is a pretty good tool for detecting SVs.

But I am confused about the result files, sv.xxx.txt and cnv.xxx.txt.

In my understanding, CNVs are a subset of SVs. The simple SV types are deletion (DEL), duplication (DUP), inversion (INV), insertion (INS), and translocation (TRA), while CNVs are the unbalanced variants: DEL and DUP, and sometimes also INS.

But in the results, SVs and CNVs are split into two files. Some intervals overlap between the two files, but there are also intervals unique to cnv.xxx.txt or to sv.xxx.txt. So, for downstream analysis, may I merge the two files (merging overlapping intervals) to get more SV sites?

Thanks a lot!

QianghuiZhu avatar Nov 03 '22 12:11 QianghuiZhu

The SV.xx.txt file has a comprehensive list of SVs, but some CNVs, especially those in highly repetitive sequences, can be missed and will not be present in this file. It's probably okay to combine CNVs from the two files, but do check a few of the CNVs unique to the cnv file to make sure they are true CNVs. I hope this is helpful. Let me know if you have any other questions.

-- Mahul Chakraborty Assistant Professor Department of Biology Texas A&M University Phone: 949 824 9559 Fax: 979-845-2891 Website: https://mahulchakraborty.wordpress.com/ Github: https://github.com/mahulchak

mahulchak avatar Nov 03 '22 18:11 mahulchak

Many thanks!

I'll try to merge these two files and filter out the overlapping sites.
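For readers attempting the same merge, here is a minimal sketch of collapsing overlapping intervals once the records from both files are pooled. It assumes each record has already been reduced to a `(chrom, start, end)` tuple with inclusive coordinates; the function name `merge_intervals` and the example values are hypothetical and do not come from svmu.

```python
def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end) intervals.

    Assumption: coordinates are inclusive, so two intervals overlap
    when the next start is <= the previous end.
    """
    # Normalize so start <= end, then sort by chromosome, start, end.
    normalized = sorted((c, min(s, e), max(s, e)) for c, s, e in intervals)
    merged = []
    for chrom, start, end in normalized:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous interval: extend its end if needed.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up intervals standing in for the pooled sv and cnv records.
pooled = [
    ("chr1", 100, 200),
    ("chr1", 150, 250),
    ("chr1", 300, 400),
    ("chr2", 100, 200),
]
print(merge_intervals(pooled))
```

Merging this way discards the SV type and copy-number columns, so it is only suitable when the downstream analysis needs the sites alone.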

QianghuiZhu avatar Nov 04 '22 01:11 QianghuiZhu

Hi! I'm sorry to bother you, but I have a small question: are the coordinates in the result files 0-based or 1-based? Thank you!

QianghuiZhu avatar Jan 08 '23 08:01 QianghuiZhu

To the best of my knowledge, they're 1-based.
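If the coordinates are indeed 1-based, and assuming the end position is inclusive (the thread does not confirm this), converting an interval to the 0-based, half-open convention used by BED files only requires subtracting 1 from the start. The helper name `one_based_to_bed` below is hypothetical; this is a sketch under those assumptions, not part of svmu.

```python
def one_based_to_bed(chrom, start, end):
    """Convert a 1-based inclusive interval to 0-based half-open (BED style).

    Assumption: the input interval is closed, i.e. both start and end
    are part of the feature.
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based interval: {start}-{end}")
    # The start shifts down by one; the end is unchanged because the
    # half-open end is exclusive.
    return (chrom, start - 1, end)


print(one_based_to_bed("chr1", 1, 100))
```

The interval length is preserved: `end - start + 1` in the 1-based form equals `end - (start - 1)` in the BED form.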

mahulchak avatar Jan 11 '23 04:01 mahulchak

Thank you again for your reply.

QianghuiZhu avatar Jan 11 '23 06:01 QianghuiZhu

Hi, what do the columns in the coords and cnv files represent? The files were produced without headers.

Maxim-Karpov avatar Mar 10 '23 12:03 Maxim-Karpov

I did not get any header information either. My guess is that the columns are: ref_chr ref_start ref_end query_chr query_start query_end ref_copy query_copy. I am not sure about the last two columns, since I only use the coordinates.
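To work with the headerless file under that guess, one option is to attach the names at read time. The sketch below assumes the file is tab-delimited with exactly eight columns; both the delimiter and the column names are unconfirmed guesses from this thread, not documented svmu output, and `read_headerless` is a hypothetical helper.

```python
import csv
import io

# Guessed column names from this thread; NOT confirmed by the developer.
GUESSED_COLUMNS = (
    "ref_chr", "ref_start", "ref_end",
    "query_chr", "query_start", "query_end",
    "ref_copy", "query_copy",
)


def read_headerless(handle, columns=GUESSED_COLUMNS):
    """Read a tab-delimited file without a header into a list of dicts."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(columns):
            # Fail loudly if the guessed layout does not match the file.
            raise ValueError(
                f"expected {len(columns)} columns, got {len(fields)}: {fields}"
            )
        rows.append(dict(zip(columns, fields)))
    return rows


# Made-up line standing in for one row of a cnv file.
example = io.StringIO("chr1\t100\t200\tchr1\t110\t210\t1\t2\n")
rows = read_headerless(example)
print(rows[0]["ref_start"], rows[0]["ref_end"])
```

All values are returned as strings; convert the coordinate fields with `int()` before doing any interval arithmetic. The column-count check is there so that a wrong guess about the layout surfaces immediately instead of silently mislabeling fields.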

QianghuiZhu avatar Mar 13 '23 03:03 QianghuiZhu

Thank you! I hope the developer can chime in on this as well.

Maxim-Karpov avatar Mar 13 '23 19:03 Maxim-Karpov