gapseq
[Question] Can you give it predetermined genes/proteins?
For example, if you have eukaryotic genomes with eukaryotic gene calls, can you use those instead of providing full genomes? If not, could this be added as an option in the future?
Hi! Unfortunately, this is not possible yet, but we are currently working on this option. This will be part of a major update to improve performance (i.e. reducing runtime). I'll tag this issue when we add the option to the main branch.
Is this feature available in the current version?
Not yet. We have an experimental branch where we are currently running tests with this option. We'll post updates here or in the duplicate issue #64.
With the latest gapseq version, you can now use a multi-protein fasta file as genome input.
Here's an example workflow:
# Reaction+Pathway prediction
gapseq find -p all -t Bacteria -m Bacteria -b 200 e_lenta.faa.gz
# Transporter prediction
gapseq find-transport -b 200 e_lenta.faa.gz
# Draft network reconstruction
gapseq draft -r e_lenta-all-Reactions.tbl -t e_lenta-Transporter.tbl -b pos -u 200 -l 100 -p e_lenta-all-Pathways.tbl
# Gapfill-medium prediction
gapseq medium -m e_lenta-draft.RDS -p e_lenta-all-Pathways.tbl -c "cpd00007:0"
# Gapfilling
gapseq fill -m e_lenta-draft.RDS -c e_lenta-rxnWeights.RDS -g e_lenta-rxnXgenes.RDS -b 100 -n e_lenta-medium.csv
Please note: When providing a protein fasta, you need to set the taxonomic domain manually using the -t option. Similarly, you need to specify which biomass reaction is added in the draft reconstruction step using the -b option.
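For use in a pipeline, the steps above could be wrapped in a small shell function. This is only a sketch: the helper names model_prefix and run_workflow are hypothetical, and the assumption that gapseq names its output files after the input file with the .faa.gz ending removed is inferred from the e_lenta example above, not from the gapseq documentation. All gapseq options are copied unchanged from the example workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: derive the output prefix from a protein fasta path.
# Assumption (inferred from the example above): e_lenta.faa.gz -> e_lenta
model_prefix() {
  local name
  name="$(basename "$1")"   # drop any leading directory
  name="${name%.gz}"        # drop an optional .gz ending
  name="${name%.*}"         # drop the fasta extension (.faa, .fasta, ...)
  printf '%s\n' "$name"
}

# Hypothetical wrapper: run the five steps for one protein fasta.
# Domain (-t Bacteria) and biomass (-b pos) are fixed here exactly as in
# the example; adjust both by hand for other organisms.
run_workflow() {
  local faa="$1"
  local id
  id="$(model_prefix "$faa")"

  # Reaction+Pathway prediction
  gapseq find -p all -t Bacteria -m Bacteria -b 200 "$faa"
  # Transporter prediction
  gapseq find-transport -b 200 "$faa"
  # Draft network reconstruction
  gapseq draft -r "${id}-all-Reactions.tbl" -t "${id}-Transporter.tbl" -b pos -u 200 -l 100 -p "${id}-all-Pathways.tbl"
  # Gapfill-medium prediction
  gapseq medium -m "${id}-draft.RDS" -p "${id}-all-Pathways.tbl" -c "cpd00007:0"
  # Gapfilling
  gapseq fill -m "${id}-draft.RDS" -c "${id}-rxnWeights.RDS" -g "${id}-rxnXgenes.RDS" -b 100 -n "${id}-medium.csv"
}
```

With gapseq on the PATH, calling run_workflow e_lenta.faa.gz would run the same five commands as above. Because of set -e, the function stops at the first step that fails, so later steps never run on missing intermediate files.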
Looking forward to it. Once you get this sorted out I'm going to add it into my metagenomics pipeline as a separate module.
Does this also work for microeukaryotes like diatoms and fungi?