ncbi-acc-download
Running with `--format fasta` creates an empty FASTA file
(This was already mentioned in https://github.com/kblin/ncbi-acc-download/issues/13#issuecomment-531677362, but I think it is better to have a separate issue.)
ncbi-acc-download --format fasta --recursive --verbose AAXATB000000000.1
creates an empty fasta file, while
ncbi-acc-download --format genbank --recursive --verbose AAXATB000000000.1
creates a genbank file as expected.
The same thing happened when I tried it on `ACIN00000000.3`.
IIUC, an easy fix is to implement `--format fasta` by using `--format genbank` (the default) and then using `SeqIO.convert` (from Biopython), e.g.:
SeqIO.convert('AAXATB000000000.1.gbk', 'genbank', 'AAXATB000000000.1.gbk.fasta', 'fasta')
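To make the suggestion a bit more concrete, here is a minimal sketch of how such a fallback might look. The helper names `genbank_to_fasta` and `count_fasta_records` are hypothetical and not part of ncbi-acc-download; the only library call used is Biopython's `SeqIO.convert`, which returns the number of records it converted. I have not checked how this behaves on records that carry no sequence data.

```python
import os


def genbank_to_fasta(gbk_path, fasta_path=None):
    """Convert a GenBank file to FASTA and return the number of records written.

    Hypothetical helper, not part of ncbi-acc-download.
    """
    # Imported lazily so Biopython is only needed when a conversion is requested.
    from Bio import SeqIO

    if fasta_path is None:
        fasta_path = gbk_path + ".fasta"
    # SeqIO.convert returns the number of records it converted.
    count = SeqIO.convert(gbk_path, "genbank", fasta_path, "fasta")
    if count == 0:
        raise ValueError("no records found in {}".format(gbk_path))
    return count


def count_fasta_records(fasta_path):
    """Count FASTA header lines; 0 means the file is missing or empty."""
    if not os.path.exists(fasta_path):
        return 0
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

Something like `count_fasta_records('AAXATB000000000.1.fa') == 0` could be used to detect the empty download and only then fall back to `genbank_to_fasta('AAXATB000000000.1.gbk')`.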
The NCBI Entrez API does deliver FASTA files, just not if you query for WGS master entries.
I don't really want to depend on Biopython for all of ncbi-genome-download, but arguably we could go that path if `--recursive` is specified, as we depend on Biopython for `--recursive` anyway.