
can we convert the cDBG to GFA format with the current output files from spacegraphcats?

Open taylorreiter opened this issue 3 years ago • 2 comments

I'm wondering if we can convert the current bcalm output from spacegraphcats to GFA. These are the current output files in *_k31:

bcalm.inputlist.txt  bcalm.unitigs.db      cdbg.gxt          contigs.mphf   reads.bgz.index
bcalm.log.txt        bcalm.unitigs.fa.sig  contigs.indices   contigs.sig
bcalm_to_gxt.log     bcalm.unitigs.pickle  contigs.info.csv  contigs.sizes

I used to do the conversion from the *unitigs.fa file, but I think that became a temp() file that gets deleted.
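
For reference, a conversion of this kind can be sketched roughly as below. This is only a minimal sketch, not the script referred to above: it assumes a unitigs FASTA that still carries BCALM2-style headers (for example `>0 LN:i:6 KC:i:2 km:f:1.0 L:+:1:+`), that adjacent unitigs overlap by k-1 bases, and that the target is GFA 1.0. The function names `parse_fasta` and `bcalm_to_gfa` are hypothetical.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def bcalm_to_gfa(lines, k):
    """Return GFA 1.0 lines for unitigs with BCALM2-style headers (assumed format)."""
    overlap = f"{k - 1}M"  # assumption: neighbouring unitigs share k-1 bases
    segments, links = [], []
    for header, seq in parse_fasta(lines):
        fields = header.split()
        name = fields[0]
        # Fields such as LN:i:6 are passed through as optional GFA tags.
        tags = [f for f in fields[1:] if not f.startswith("L:")]
        segments.append("\t".join(["S", name, seq] + tags))
        # Fields such as L:+:1:- describe an edge to another unitig.
        for field in fields[1:]:
            if field.startswith("L:"):
                _, src_orient, dst, dst_orient = field.split(":")
                links.append(
                    "\t".join(["L", name, src_orient, dst, dst_orient, overlap])
                )
    return ["H\tVN:Z:1.0"] + segments + links


# Toy input with k=5; real input would be the BCALM unitigs FASTA.
example = [
    ">0 LN:i:6 KC:i:2 km:f:1.0 L:+:1:+",
    "ACGTAC",
    ">1 LN:i:6 KC:i:2 km:f:1.0 L:-:0:-",
    "GTACGG",
]
gfa_lines = bcalm_to_gfa(example, k=5)
```

All of the link lines come from the `L:` fields in the FASTA headers, so a FASTA without those headers yields segments but no edges.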

taylorreiter avatar Jul 12 '22 17:07 taylorreiter

yes, it's a temp file - https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/conf/Snakefile#L168

and the header information is neither preserved nor outputtable, per https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/cdbg/dump_contigs_db_to_fasta.py

so, we can't convert the current output, no :(.

might be something to consider doing as part of https://github.com/spacegraphcats/spacegraphcats/pull/430 tho.

ctb avatar Jul 14 '22 13:07 ctb

ok great, thanks. I'm increasingly thinking of ways to better integrate with other people interested in metagenome assembly graphs, and I think GFA is a pretty standard format that wouldn't be a bad idea to either output or have a simple conversion script for. #430 looks exciting!

taylorreiter avatar Jul 14 '22 13:07 taylorreiter