VarSim error: ALT column is malformated: Found illegal character: 46

Open bioinfo89 opened this issue 3 years ago • 0 comments

Hi, I am using VarSim for simulation with the following command:

python varsim/varsim.py --vc_in_vcf varsim/All_20170710_wdchr+header.vcf --sv_insert_seq varsim/insert_seq.txt --sv_dgv GRCh37_hg19_supportingvariants_2020-02-25_wdchr.txt --reference hg19/chr22.fa --id Simchr22 --read_length 125 --mean_fragment_size 500 --sd_fragment_size 100 --sv_num_ins 0 --sv_num_dup 0 --sv_num_inv 0 --vc_num_del 1000 --sv_num_del 5000 --sv_percent_novel 0.2 --vc_percent_novel 0.01 --vc_min_length_lim 1 --vc_max_length_lim 29 --sv_min_length_lim 30 --sv_max_length_lim 1000000 --nlanes 1 --total_coverage 30 --simulator_executable art_bin_MountRainier/art_illumina --out_dir Varsim_trial/chr22 --log_dir log --simulator art --profile_1 art_bin_MountRainier/Illumina_profiles/HiSeq2500L125R1.txt --profile_2 art_bin_MountRainier/Illumina_profiles/HiSeq2500L125R2.txt --varsim_jar VarSim.jar --java_max_mem 35g

This is the error I get:

22 Oct 2020 15:16:22,315  WARN [main] (VCFparser.java:368): ALT column is malformated: Found illegal character: 46
Offending line: chr1    1167687 rs1085307652    G       .       .       .       RS=1085307652;RSPOS=1167688;dbSNPBuildID=150;SSR=0;SAO=0;VP=0x050060020005000002100200;GENEINFO=B3GALT6:126792|SDF4:51150;WGT=1;VC=DIV;PM;R5;ASP;LSD

I checked the corresponding lines, and it seems that for those variants in the dbSNP VCF the ALT allele is given as '.' (character code 46 is the ASCII code for '.'). I am not sure whether changing the '.' to '0' or '-' would resolve the error.
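As a possible workaround, the records with a '.' in the ALT column could be dropped from the VCF before passing it to `--vc_in_vcf`. The following is only a minimal sketch under that assumption: I have not confirmed that skipping these records is what VarSim expects, and the names `drop_missing_alt` and `filter_vcf` are hypothetical helpers, not part of VarSim. It assumes a plain-text, tab-delimited VCF.

```python
# The "46" in the error message is a character code: chr(46) is ".".
ILLEGAL_CHAR = chr(46)


def drop_missing_alt(lines):
    """Yield VCF lines, skipping data records whose ALT column is '.'.

    Header lines (starting with '#') are passed through unchanged.
    Assumes tab-delimited columns: CHROM, POS, ID, REF, ALT, ...
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # ALT is the fifth column (index 4).
        if len(fields) > 4 and fields[4] == ILLEGAL_CHAR:
            continue
        yield line


def filter_vcf(in_path, out_path):
    """Write a copy of in_path to out_path without '.'-ALT records."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_missing_alt(src))
```

For example, `filter_vcf("in.vcf", "in.filtered.vcf")` (both file names are placeholders) would write a filtered copy that could then be given to `--vc_in_vcf`.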

Could you please help with this? Thanks!

bioinfo89, Oct 22 '20 22:10