TCAG-WGS-CNV-workflow
Merging ERDS and CNVnator files error
Hi,
I ran the script process_cnvs.erds+.sh and got the following error. Could you please help me out with it?
/scratch/SOFTWARE/TCAG-WGS-CNV-workflow/format_cnvnator_results.py /scratch/SOFTWARE/TCAG-WGS-CNV-workflow/format_erds_results.py /scratch/SOFTWARE/TCAG-WGS-CNV-workflow/merge_cnvnator_results.py /scratch/SOFTWARE/TCAG-WGS-CNV-workflow/merge_erds_results.py /scratch/SOFTWARE/TCAG-WGS-CNV-workflow/add_features.py /scratch/SOFTWARE/TCAG-WGS-CNV-workflow/hg19_gap.bed
Set-up..
Found erds/original, creating erds/formatted
Found cnvn/original, creating cnvn/formatted_filtered
Formatting, filtering and merging cnvn output
Warning: Header not found, using default..
Traceback (most recent call last):
File "/scratch/SOFTWARE/TCAG-WGS-CNV-workflow/format_cnvnator_results.py", line 69, in
Thanks!
Hi,
Unfortunately I cannot tell offhand from the error messages. Are you able to share your CNVnator and ERDS files that you are using so I can test it myself?
Cheers,
Brett
Hi,
Thanks for your reply!
I have attached a snippet of the files I am working with to this email. Please let me know if you need anything else.
Best,
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/bjtrost/TCAG-WGS-CNV-workflow/issues/11#issuecomment-636580852, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOU5LL4J435GEB352TIX7CDRUMHOPANCNFSM4NPDCJEA .
-- Regards Ghausia Begum
##fileformat=VCFv4.0
##fileDate=20190320
##reference=1000GenomesPhase3_decoy-GRCh37
##source=CNVnator
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=IMPRECISE,Number=0,Type=Flag,Description="Imprecise structural variation">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=natorRD,Number=1,Type=Float,Description="Normalized RD">
##INFO=<ID=natorP1,Number=1,Type=Float,Description="e-val by t-test">
##INFO=<ID=natorP2,Number=1,Type=Float,Description="e-val by Gaussian tail">
##INFO=<ID=natorP3,Number=1,Type=Float,Description="e-val by t-test (middle)">
##INFO=<ID=natorP4,Number=1,Type=Float,Description="e-val by Gaussian tail (middle)">
##INFO=<ID=natorQ0,Number=1,Type=Float,Description="Fraction of reads with 0 mapping quality">
##INFO=<ID=natorPE,Number=1,Type=Integer,Description="Number of paired-ends support the event">
##INFO=<ID=SAMPLES,Number=.,Type=String,Description="Sample genotyped to have the variant">
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=DUP,Description="Duplication">
#CHROM POS ID REF ALT QUAL FILTER INFO
1 1 CNVnator_del_1 N <DEL> . PASS END=10000;SVTYPE=DEL;SVLEN=-10000;IMPRECISE;natorRD=0;natorP1=1.59373e-11;natorP2=1.56125e-280;natorP3=1.99216e-11;natorP4=1.11272e-222;natorQ0=-1
1 150001 CNVnator_del_2 T <DEL> . PASS END=155500;SVTYPE=DEL;SVLEN=-5500;IMPRECISE;natorRD=0.578662;natorP1=0.451236;natorP2=1.82038e-07;natorP3=279.558;natorP4=0.14135;natorQ0=0.883817
1 176001 CNVnator_del_3 G <DEL> . PASS END=227500;SVTYPE=DEL;SVLEN=-51500;IMPRECISE;natorRD=0.016173;natorP1=3.09461e-12;natorP2=5.19618e-152;natorP3=3.21965e-12;natorP4=0;natorQ0=0.967427
1 231001 CNVnator_dup_4 C <DUP> . PASS END=242000;SVTYPE=DUP;SVLEN=11000;IMPRECISE;natorRD=1.91092;natorP1=0.00214848;natorP2=0.0216277;natorP3=0.0357276;natorP4=2.27708;natorQ0=0.284183
1 267501 CNVnator_del_5 A <DEL> . PASS END=318000;SVTYPE=DEL;SVLEN=-50500;IMPRECISE;natorRD=0.00471782;natorP1=3.15589e-12;natorP2=0;natorP3=3.28603e-12;natorP4=0;natorQ0=0.248649
1 326501 CNVnator_del_6 G <DEL> . PASS END=327500;SVTYPE=DEL;SVLEN=-1000;IMPRECISE;natorRD=0.374265;natorP1=49058.3;natorP2=0.0327534;natorP3=1;natorP4=1;natorQ0=1
1 386501 CNVnator_del_7 C <DEL> . PASS END=521500;SVTYPE=DEL;SVLEN=-135000;IMPRECISE;natorRD=0.449698;natorP1=1.18054e-12;natorP2=2.871e+09;natorP3=1.19829e-12;natorP4=2.871e+09;natorQ0=0.984279
1 1011501 CNVnator_del_8 T <DEL> . PASS END=1013500;SVTYPE=DEL;SVLEN=-2000;IMPRECISE;natorRD=0.456255;natorP1=1301.48;natorP2=0.010141;natorP3=1;natorP4=1;natorQ0=0.386617
1 1285501 CNVnator_del_9 G <DEL> . PASS END=1287000;SVTYPE=DEL;SVLEN=-1500;IMPRECISE;natorRD=0.214173;natorP1=13978.1;natorP2=2.05364e-06;natorP3=1;natorP4=1;natorQ0=0.182432
1 2053001 CNVnator_del_10 G <DEL> . PASS END=2055500;SVTYPE=DEL;SVLEN=-2500;IMPRECISE;natorRD=0.219492;natorP1=377.504;natorP2=1.37905e-08;natorP3=1;natorP4=1;natorQ0=0.158371
1 2583001 CNVnator_dup_11 A <DUP> . PASS END=2616000;SVTYPE=DUP;SVLEN=33000;IMPRECISE;natorRD=4.74277;natorP1=0;natorP2=0.00145508;natorP3=0;natorP4=0.00809184;natorQ0=0.780204
1 2634001 CNVnator_del_12 G <DEL> . PASS END=2684500;SVTYPE=DEL;SVLEN=-50500;IMPRECISE;natorRD=0.00391504;natorP1=3.15589e-12;natorP2=0;natorP3=3.28603e-12;natorP4=0;natorQ0=0.934783
1 3845501 CNVnator_del_13 N <DEL> . PASS END=3995500;SVTYPE=DEL;SVLEN=-150000;IMPRECISE;natorRD=0.00149144;natorP1=1.06248e-12;natorP2=0;natorP3=1.07684e-12;natorP4=0;natorQ0=0
1 12930501 CNVnator_dup_14 T <DUP> . PASS END=12940500;SVTYPE=DUP;SVLEN=10000;IMPRECISE;natorRD=1.7555;natorP1=0.00202992;natorP2=0.06397;natorP3=0.04097;natorP4=8.63749;natorQ0=0.104094
1 13027001 CNVnator_del_15 T <DEL> . PASS END=13038500;SVTYPE=DEL;SVLEN=-11500;IMPRECISE;natorRD=0.666445;natorP1=0.00467794;natorP2=3400.01;natorP3=0.0715599;natorP4=36490.9;natorQ0=0.839578
1 13053001 CNVnator_del_16 N <DEL> . PASS END=13116000;SVTYPE=DEL;SVLEN=-63000;IMPRECISE;natorRD=0.110352;natorP1=2.52972e-12;natorP2=1.19574e-54;natorP3=2.61266e-12;natorP4=4.20872e-163;natorQ0=0.854849
1 13117001 CNVnator_del_17 C <DEL> . PASS END=13132500;SVTYPE=DEL;SVLEN=-15500;IMPRECISE;natorRD=0.262738;natorP1=1.02821e-11;natorP2=3.97224e-33;natorP3=5.90269e-11;natorP4=1.00045e-27;natorQ0=0.918182
1 13155501 CNVnator_del_18 A <DEL> . PASS END=13166000;SVTYPE=DEL;SVLEN=-10500;IMPRECISE;natorRD=0.556388;natorP1=0.00728893;natorP2=7.79484e+07;natorP3=0.24383;natorP4=1.5493e+08;natorQ0=0.720264
1 13220001 CNVnator_del_19 N <DEL> . PASS END=13443000;SVTYPE=DEL;SVLEN=-223000;IMPRECISE;natorRD=0.4124;natorP1=7.14675e-13;natorP2=2.8694e+09;natorP3=7.21143e-13;natorP4=2.86941e+09;natorQ0=0.771207
Ah - I suspect it is because you are using the VCF version of the CNVnator calls rather than the tab-delimited version. The script expects the tab-delimited version. (However, the ERDS file is expected to be the VCF). I will edit the docs to make this clearer. Please let me know if this works. Sorry for the confusion!
Cheers,
Brett
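As a rough way to check which version of a calls file is on hand before running the workflow, something like the sketch below may help. It is not part of TCAG-WGS-CNV-workflow: the function name `guess_cnv_file_format` is hypothetical, and it assumes that a VCF starts with a `##fileformat=VCF` line (or has a `#CHROM` header) and that CNVnator's tab-delimited output starts each row with `deletion` or `duplication`.

```python
def guess_cnv_file_format(lines):
    """Guess whether an iterable of text lines looks like VCF or CNVnator tab-delimited calls.

    Returns 'vcf', 'cnvnator_tab', or 'unknown'. This is a heuristic only.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # VCF files declare their format in the first meta line, or at least carry a #CHROM header.
        if line.startswith("##fileformat=VCF") or line.startswith("#CHROM"):
            return "vcf"
        if line.startswith("#"):
            continue  # some other comment or header line; keep looking
        # Assumption: CNVnator's native call rows begin with the CNV type.
        first_field = line.split("\t")[0].lower()
        if first_field in ("deletion", "duplication"):
            return "cnvnator_tab"
        return "unknown"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python check_format.py 006.calls.txt
    with open(sys.argv[1]) as handle:
        print(guess_cnv_file_format(handle))
```

If this reports `vcf` for the file passed as the CNVnator input, the file is probably not in the tab-delimited form that the script expects.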
I am using the VCF version for the ERDS variants and the txt version for the CNVnator variants. The files I sent are supposed to be 006.erds.vcf and 006.calls.txt (I changed the format while sending them to you; sorry for the confusion).
I used the correct format for each file yet I get the error.
Can you send me the actual files you used? You could include just the headers plus the first three non-header lines in each file. I think this should be enough to diagnose the problem.
Cheers,
Brett
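One possible way to produce such an excerpt (the header plus the first three non-header lines) is sketched below. The helper name `header_plus_first_records` is hypothetical, and the sketch assumes that header lines are the ones starting with `#`, which holds for VCF but may not hold for other formats.

```python
def header_plus_first_records(lines, n_records=3):
    """Return all leading '#' header lines plus the first n_records non-header lines."""
    kept = []
    records_seen = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        if not line.startswith("#"):
            if records_seen >= n_records:
                break  # enough records collected; stop reading
            records_seen += 1
        kept.append(line)
    return kept


if __name__ == "__main__":
    import sys

    # Usage: python excerpt.py 006.erds.vcf > 006.erds.excerpt.vcf
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(header_plus_first_records(handle))
```

Writing the excerpt from the original file, rather than copying and pasting, keeps the tab characters and the file format unchanged.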
The files I sent are the actual files (the header plus a few lines); only the file names are different. Nevertheless, I have attached copies to this email. 006.calls.txt contains the variants called by CNVnator, and 006.erds.vcf contains the variants called by ERDS.
I hope this helps.
-- Regards Ghausia Begum
For some reason, I cannot see the attachments. Can you send them as a regular e-mail to [email protected]? Thanks!