VG Construct fails: alignment does not start with match over padded sequence
I have constructed graphs previously without encountering this issue. Any ideas or solutions? Thanks in advance.
My command looks like this:
vg construct -r ref.fna -v sample.vcf.gz > sample.vg
And this is the full error:
warning:[vg::Constructor] Lowercase characters found in NC_004354.4; coercing to uppercase.
parsedAlternates: alignment does not start with match over padded sequence
15M4I9M1S
ZZZZZZZZZZQTTTZZZZZZZZZZ
ZZZZZZZZZZQNON_REF>ZZZZZZZZZZ
It looks like your VCF has the string NON_REF (maybe <NON_REF>?) somewhere in an alt. We can't deal with that variant; we can only deal with variants that actually specify the alternate allele fully.
Try dropping the variant with something like:
zcat sample.vcf.gz | grep -v "NON_REF" > sample.clean.vcf
bgzip sample.clean.vcf
tabix -p vcf sample.clean.vcf.gz
We really ought to produce a better error message than this when we encounter this situation, though. The text of that symbolic alt is getting into the gears of the vcflib align-the-alts-to-the-ref code and exploding, instead of being caught earlier.
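Note that grep -v "NON_REF" drops every line containing that string, including any header line that declares it. A minimal sketch of a filter that keeps all header lines and drops only records whose ALT column contains a symbolic allele is shown here. The function name drop_symbolic_alts is hypothetical, and the sketch assumes a standard tab-separated VCF in which ALT is column 5 and symbolic alleles are written with angle brackets.

```shell
# Hypothetical helper (not part of vg or vcflib): read a VCF on stdin,
# keep every header line, and keep only those records whose ALT column
# (field 5) contains no "<", i.e. no symbolic allele such as <NON_REF>.
drop_symbolic_alts() {
    awk -F'\t' '/^#/ || $5 !~ /</'
}

# Usage, with the file names from this thread:
#   zcat sample.vcf.gz | drop_symbolic_alts > sample.clean.vcf
#   bgzip sample.clean.vcf
#   tabix -p vcf sample.clean.vcf.gz
```

Like the grep version, this removes the whole record, so any fully specified alleles that share a line with the symbolic one are dropped too.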
Hi @adamnovak
I got a similar error, but NON_REF does not appear in my error message (I elided parts of it because it was too long):
Restricting to chr22 from 1 to end
building graph for chr22
parsedAlternates: alignment does not start with match over padded sequence
71400S
ZZZ...(skip)...ZZQZZ...(skip)...ZZZ
ZZZ...(skip)...ZZQTGG...(skip)...CAGZZZ...(skip)...ZZZ
I also checked that the ALT sequence region contains only A, T, G, and C, using the following command:
cat error_message | grep [ATGC]
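One caveat on the check above: grep [ATGC] prints any line that contains at least one of those letters, so it does not show that other characters are absent. A minimal sketch of a stricter check, run against the VCF itself rather than the error message, is shown here. The function name find_non_acgt_alts and the file name input.vcf.gz are hypothetical, and the sketch assumes ALT is tab-separated column 5 with alleles separated by commas.

```shell
# Hypothetical helper: read a VCF on stdin and print every record whose
# ALT column (field 5) contains anything other than A, C, G, T
# (either case) or the comma that separates multiple alleles.
find_non_acgt_alts() {
    awk -F'\t' '!/^#/ && $5 !~ /^[ACGTacgt,]+$/'
}

# Usage (input.vcf.gz is a placeholder file name):
#   zcat input.vcf.gz | find_non_acgt_alts | head
```

If this prints nothing, every ALT allele in the file is a plain nucleotide sequence.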
This is a horrible hack in vcflib, where I'm trying to fake a global alignment using these weird characters in the allele strings.
We should change this to use one of the alignment methods in vg. The banded global aligner would be ideal, or maybe xdrop. Either would be better than this hack.
You can avoid this by setting --flat-alts in vg construct.
Thank you for the kind reply, Erik!
The error went away after applying --flat-alts.
We are still working on this. An update to vcflib will eliminate this error.
On Wed, Apr 27, 2022, 04:50, Cade Mirchandani wrote:
Closed #2208 https://github.com/vgteam/vg/issues/2208.