ohell
Just chiming in to suggest that it would be useful for `bgzip`ped GFA files to be accepted. It is maybe not such a big deal, but still inconvenience...
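Until compressed input is accepted, one workaround is to decompress the GFA before handing it to `vg`. A minimal sketch, assuming the input is an ordinary `bgzip` output file: BGZF is a series of concatenated gzip members, and Python's standard `gzip` module reads multi-member files. The function name `decompress_gzip` and the file names are hypothetical, chosen only for illustration.

```python
import gzip
import shutil


def decompress_gzip(src, dst):
    """Write the decompressed contents of a gzip/bgzip file `src` to `dst`."""
    # gzip.open reads all concatenated gzip members in sequence,
    # which is how a bgzip (BGZF) file is laid out.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        # Stream in chunks so large graphs are never fully held in memory.
        shutil.copyfileobj(fin, fout)
    return dst


# Hypothetical usage:
# decompress_gzip("graph.gfa.gz", "graph.gfa")
```

The plain `graph.gfa` can then be passed to `--gfa` as usual.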
Oh yes, you're right. The complete message is this: ``` vg autoindex --workflow mpmap --workflow rpvg --gfa /data/ecoli-small.pggb/few-ecoli-genomes.fasta.gz.bf3285f.11fba48.42e55e5.smooth.final.gfa --ref-fasta /data/few-ecoli-genomes.fasta --tx-gff /data/final.gtf --hap-tx-gff /data/final.gtf --prefix /data/ecoli-small --threads 4 [IndexRegistry]: Checking...
Output of both commands (it's the same GFA in both cases; the second one looks different because I use an alias for `docker run`): ``` $ /bin/grep '^H' ecoli-small.pggb/few-ecoli-genomes.fasta.gz.bf3285f.11fba48.42e55e5.smooth.final.gfa | cut -f2 VN:Z:1.0 $...
Output without cutting:
```
$ /bin/grep '^H' ecoli-small.pggb/few-ecoli-genomes.fasta.gz.bf3285f.11fba48.42e55e5.smooth.final.gfa
H VN:Z:1.0
```
I am happy to upload the FASTA if it would be helpful ... though it is just E. coli...
@adamnovak I have put the files in this [shared folder](https://drive.google.com/drive/folders/14Dej9hO3wW0NXpC32_uuoarIYTbIPoPy?usp=drive_link). I have also included the various `ecoli-small.*` files created by the autoindexer before it failed. Thanks again for your help.
> can you adjust your annotations to not re-use the same transcript ID (i.e. unassigned_transcript_198) across multiple haplotypes? Or is that something you think you really ought to be able...
Deduplicating transcript IDs before passing them to vg is certainly possible, but my point is that almost every user attempting to create a pantranscriptome from a set of assemblies and corresponding...
I wrote a simple awk script that maintains a hash table of seen counts for IDs and appends `__count` to an ID if it has been seen multiple times. Guess you can follow...
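For anyone who wants the same idea outside awk, here is a rough Python sketch of the counting approach. It is not the original script. It assumes each element of the input is one occurrence that needs its own name, that the first occurrence keeps its ID unchanged, and that later ones get `__2`, `__3`, and so on. The function name `dedupe_ids` is hypothetical.

```python
from collections import defaultdict


def dedupe_ids(ids, sep="__"):
    """Return a copy of `ids` in which repeated IDs get a `__count` suffix."""
    seen = defaultdict(int)  # hash table: ID -> number of times seen so far
    result = []
    for id_ in ids:
        seen[id_] += 1
        count = seen[id_]
        # Assumption: the first occurrence is left as-is; only repeats are renamed.
        result.append(id_ if count == 1 else f"{id_}{sep}{count}")
    return result


print(dedupe_ids(["tx_a", "tx_b", "tx_a", "tx_a"]))
```

Two caveats. In a real GTF, every row of a transcript (exons, CDS) carries the same transcript ID, so the count would have to be keyed per haplotype or contig rather than per row. The sketch also does not check whether a generated name such as `tx_a__2` collides with an ID already in the file.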
Hello, thanks for the tip. But the command runs if I replace the FASTA with the uncompressed version. Also, I compared the contigs just now, looks OK: ```$ grep "^P"...
> I'm not sure if we can accept gzipped fasta files. Unzipping files is annoying to do in C++; I remember because I tried poking around accepting gzipped GFF3 files...