bgcflow
Suggestions for documentation
I am creating an issue to list any feedback we get on improving documentation here.
- [x] Provide a warning and a guide for installing gcc. Many machines don't have gcc installed by default, so it would be nice to add a note on this requirement to the installation instructions, e.g. to try the `gcc --version` command before installing BGCFlow.
- [x] We also need `unzip` for the ARTS post-deployment script. Some systems don't have unzip installed by default.
- [x] Add information about the funding agency and the affiliation with CFB to the documentation. A good example is https://github.com/antismash/antismash
- [ ] Add information on how to check the logs. I think it is quite important to tell readers to look for the global Snakemake logs in the `.snakemake/log/` folder, whereas the more useful logs with error messages are stored in `workflow/report/logs/rule_name`.
- [ ] A special information point would be helpful for GTDB and GTDB-Tk project management. Typically not all genomes have GTDB information, so users almost always need to create an intermediate PEP to run GTDB-Tk. Mention that GTDB-Tk requires significant memory and time.
- [ ] Mention that users can update the GTDB-Tk release version from the config file. This information has been completely removed from the newest example config templates. Of course, using release 214 is recommended going forward.
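
The gcc and unzip checks from the list above could be sketched as a small pre-install script. This is only an illustration: `check_tools` is a hypothetical helper name and is not part of BGCFlow or bgcflow_wrapper.

```shell
#!/usr/bin/env bash
# Report whether each required command-line tool is on PATH.
# check_tools is a hypothetical helper, not something BGCFlow ships.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one tool was not found.
    return "$missing"
}

# gcc and unzip are the two tools named in the checklist.
check_tools gcc unzip || echo "Install the missing tools before installing BGCFlow."
```

On Debian or Ubuntu, the `build-essential` package mentioned later in this thread provides gcc.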
Following the Quick Start for installing bgcflow_wrapper, the command 'conda activate bgcflow_wrapper' returns EnvironmentNameNotFound.
Thanks @ljdnielsen for pointing it out :)
Following the quick and easy way for installing bgcflow_wrapper, step 2 sets the channel priorities to flexible. While running BGCFlow with an example dataset, the output says exactly the opposite. Output (partial):
Step 5. Preparing list of final outputs...
- Getting outputs for project: Lactobacillus_delbrueckii
- WARNING: ignoring errors in rule_dictionary
- Ready to generate all outputs.
GTDB API | Grabbing metadata using GTDB release version: r214
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
While running BGCFlow with an example dataset, creating the conda environments takes an extra long time. The following errors were found after the run:
Error in rule ncbi_genome_download:
jobid: 12
output: data/interim/fasta/GCA_000191165.1.fna, data/interim/assembly_report/GCA_000191165.1.txt, data/interim/assembly_report/GCA_000191165.1.json
log: logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000191165.1.log (check log file(s) for error details)
conda-env: /home/azureuser/datadrive/bgc_try1/.snakemake/conda/da7e441115fe0f43f2ddb22b63f792d2_
shell:
if [[ GCA_000191165.1 == GCF* ]]
then
source="refseq"
elif [[ GCA_000191165.1 == GCA* ]]
then
source="genbank"
else
echo "accession must start with GCA or GCF" >> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000191165.1.log
fi
ncbi-genome-download -s $source -F fasta,assembly-report -A GCA_000191165.1 -o data/raw/ncbi/download -P -N --verbose bacteria 2>> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000191165.1.log
gunzip -c data/raw/ncbi/download/$source/bacteria/GCA_000191165.1/*.fna.gz > data/interim/fasta/GCA_000191165.1.fna
cp data/raw/ncbi/download/$source/bacteria/GCA_000191165.1/*report.txt data/interim/assembly_report/GCA_000191165.1.txt
rm -rf data/raw/ncbi/download/$source/bacteria/GCA_000191165.1
python workflow/bgcflow/bgcflow/data/get_assembly_information.py data/interim/assembly_report/GCA_000191165.1.txt data/interim/assembly_report/GCA_000191165.1.json GCA_000191165.1 2>> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000191165.1.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
/usr/bin/bash: line 10: 740849 Killed ncbi-genome-download -s $source -F fasta,assembly-report -A GCA_000182835.1 -o data/raw/ncbi/download -P -N --verbose bacteria 2>> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000182835.1.log
[Wed Oct 11 09:48:11 2023]
Error in rule ncbi_genome_download:
jobid: 10
output: data/interim/fasta/GCA_000182835.1.fna, data/interim/assembly_report/GCA_000182835.1.txt, data/interim/assembly_report/GCA_000182835.1.json
log: logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000182835.1.log (check log file(s) for error details)
conda-env: /home/azureuser/datadrive/bgc_try1/.snakemake/conda/da7e441115fe0f43f2ddb22b63f792d2_
shell:
if [[ GCA_000182835.1 == GCF* ]]
then
source="refseq"
elif [[ GCA_000182835.1 == GCA* ]]
then
source="genbank"
else
echo "accession must start with GCA or GCF" >> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000182835.1.log
fi
ncbi-genome-download -s $source -F fasta,assembly-report -A GCA_000182835.1 -o data/raw/ncbi/download -P -N --verbose bacteria 2>> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000182835.1.log
gunzip -c data/raw/ncbi/download/$source/bacteria/GCA_000182835.1/*.fna.gz > data/interim/fasta/GCA_000182835.1.fna
cp data/raw/ncbi/download/$source/bacteria/GCA_000182835.1/*report.txt data/interim/assembly_report/GCA_000182835.1.txt
rm -rf data/raw/ncbi/download/$source/bacteria/GCA_000182835.1
python workflow/bgcflow/bgcflow/data/get_assembly_information.py data/interim/assembly_report/GCA_000182835.1.txt data/interim/assembly_report/GCA_000182835.1.json GCA_000182835.1 2>> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000182835.1.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
/usr/bin/bash: line 10: 741138 Killed ncbi-genome-download -s $source -F fasta,assembly-report -A GCA_000056065.1 -o data/raw/ncbi/download -P -N --verbose bacteria 2>> logs/ncbi/ncbi_genome_download/ncbi_genome_download_GCA_000056065.1.log
Error in rule prokka:
jobid: 36
input: data/interim/fasta/GCA_000014405.1.fna, data/interim/prokka/GCA_000014405.1/organism_info.txt
output: data/interim/prokka/GCA_000014405.1/GCA_000014405.1.gff, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.faa, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.gbk, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.txt, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.tsv, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.fna, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.sqn, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.fsa, data/interim/prokka/GCA_000014405.1/GCA_000014405.1.tbl
log: logs/prokka/prokka/prokka-GCA_000014405.1.log (check log file(s) for error details)
conda-env: /home/azureuser/datadrive/bgc_try1/.snakemake/conda/6d99c3821030d48ec757a268edfc89df_
shell:
prokka --outdir data/interim/prokka/GCA_000014405.1 --force --prefix GCA_000014405.1 --genus "`cut -d "," -f 1 data/interim/prokka/GCA_000014405.1/organism_info.txt`" --species "`cut -d "," -f 2 data/interim/prokka/GCA_000014405.1/organism_info.txt`" --strain "`cut -d "," -f 3 data/interim/prokka/GCA_000014405.1/organism_info.txt`" --cdsrnaolap --cpus 4 --increment 10 --evalue 1e-05 data/interim/fasta/GCA_000014405.1.fna &> logs/prokka/prokka/prokka-GCA_000014405.1.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Hi, it is normal to get this warning. The reason we don't use strict channel priority right now is because some of the environments are still experimental.
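
To see which setting is currently active, something like the following sketch could be used. It only reads the value with `conda config --show channel_priority` and changes nothing; `show_channel_priority` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the current channel_priority setting without modifying it.
# show_channel_priority is a hypothetical helper name.
show_channel_priority() {
    if command -v conda >/dev/null 2>&1; then
        # Prints a line such as "channel_priority: flexible".
        conda config --show channel_priority
    else
        echo "conda not found on PATH"
    fi
}

show_channel_priority
```

If this reports `flexible`, the Snakemake warning quoted above is expected and matches the installation guide.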
Yes, some of the environments are big, so they take some time to download. Nevertheless, the environment setup only happens once, or when the environment configuration is updated. Are you using conda or mamba on your machine? Switching to mamba is preferred and is usually much faster.
We are planning to containerize the environments in the future, but at the moment we don't have enough time and resources to set it up. Once the Docker containers are in place, it should be way faster to deploy BGCFlow.
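
A quick way to answer the conda-or-mamba question is to check which of the two is on the PATH. This is a minimal sketch; `pick_frontend` is a hypothetical helper name, and it does not show which one BGCFlow actually uses when building environments.

```shell
#!/usr/bin/env bash
# Report which package manager is available, preferring mamba.
# pick_frontend is a hypothetical helper name.
pick_frontend() {
    if command -v mamba >/dev/null 2>&1; then
        echo "mamba"
    elif command -v conda >/dev/null 2>&1; then
        echo "conda"
    else
        echo "none"
    fi
}

echo "Available package manager: $(pick_frontend)"
```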
For the error, can you also check what is being written in the log files? For example: `logs/prokka/prokka/prokka-GCA_000014405.1.log`.
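
As a sketch of how to pull out the relevant part of such a log, the last lines of the per-rule log file named in the Snakemake error can be printed like this. `show_log_tail` is a hypothetical helper name, and the default of 20 lines is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Print the last lines of a per-rule log file, if it exists.
# show_log_tail is a hypothetical helper; 20 lines is an arbitrary default.
show_log_tail() {
    local log=$1 lines=${2:-20}
    if [ -f "$log" ]; then
        echo "== $log (last $lines lines) =="
        tail -n "$lines" "$log"
    else
        echo "log not found: $log"
    fi
}

# Path taken from the "log:" field of the prokka error above.
show_log_tail logs/prokka/prokka/prokka-GCA_000014405.1.log
```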
Looking at the error message, it says the job was killed. This sounds more like a resource problem: an unstable internet connection, using more threads than the computer has, or running out of memory. Can you tell me more about the system you are running BGCFlow from?
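
The system details asked for here can be collected with a short script. This is a Linux-only sketch that assumes `nproc`, `/proc/meminfo`, and `dmesg` are available; `kib_to_gib` and `report_resources` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Linux-only sketch: report CPU count and total memory before a run.
# kib_to_gib and report_resources are hypothetical helper names.

# Convert a KiB value (the unit used in /proc/meminfo) to whole GiB, rounded down.
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

report_resources() {
    local mem_kib
    mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
    echo "CPUs:   $(nproc)"
    echo "Memory: $(kib_to_gib "${mem_kib:-0}") GiB"
}

report_resources

# The kernel log usually records out-of-memory kills; reading it may need sudo.
dmesg 2>/dev/null | grep -i -E "out of memory|oom-kill" \
    || echo "No OOM messages visible (try: sudo dmesg)"
```

A `Killed` line like the one in the `ncbi_genome_download` error is often accompanied by an out-of-memory entry in the kernel log, which would support the memory explanation.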
I am using a VM with 2 CPUs and 8 GB of memory.
You might need to increase the memory, as some of the tools are demanding. Also, limit the number of CPUs accordingly:
bgcflow run -c 2
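
To avoid asking for more cores than the machine has, the value passed to `-c` can be capped at the number reported by `nproc`. This is a sketch under that assumption; `cap_cores` is a hypothetical helper name, and the script only prints the resulting command instead of running it.

```shell
#!/usr/bin/env bash
# Cap the requested core count at what the machine actually has.
# cap_cores is a hypothetical helper name.
cap_cores() {
    local requested=$1 available=$2
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

# Request 8 cores, but never more than nproc reports (Linux).
cores=$(cap_cores 8 "$(nproc)")

# Print the command rather than starting a run.
echo "bgcflow run -c $cores"
```

On the 2-CPU VM described above, this prints `bgcflow run -c 2`.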
Ok, I will check it. Thanks.
I used a system with 8 CPUs and 64GB memory. Still, the problem persists. I am attaching the log file here. log_file.log
Can you attach this log file?
logs/prokka/prokka/prokka-GCA_000056065.1.log
Please find the prokka log file here, prokka_log.log
Hmm, it seems like some Perl dependencies are missing from your Linux system? I didn't have this issue on a freshly installed VM, but I do update the Linux VM before setting up gcc. So maybe do an update as mentioned here: https://github.com/NBChub/bgcflow/wiki/00-Installation-Guide#gcc-compiler:
sudo apt update
sudo apt-get install build-essential
Thanks, I will update as mentioned and will run and check workflow again.
I ran the workflow again as per your suggestions. Still, the problem persists. I have attached the log file for your reference: 2023-10-16T073106.185559.snakemake.log.txt
Hi, can you try to fetch the latest main branch? I simplified the prokka env, and I wonder if that solves the problem.
git fetch
git status
git pull
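
Before pulling, the incoming commits can be previewed with a sketch like the one below. It assumes the remote is named `origin` and that the script is run from inside the bgcflow clone; `preview_update` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List commits on the remote branch that the local checkout does not have yet.
# preview_update is a hypothetical helper; "origin" is an assumed remote name.
preview_update() {
    local branch=${1:-main}
    if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        echo "not inside a git work tree"
        return 1
    fi
    git fetch origin "$branch"
    # Commits reachable from origin/<branch> but not from HEAD.
    git log --oneline "HEAD..origin/$branch"
}

preview_update main || true
```

An empty list means the local checkout is already up to date with the fetched branch.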
I'm meeting Binhuan today at 4 for troubleshooting. If you are at Biosustain, you're welcome to join. You can also send me a message on Teams.
I will check and come back to you. If it's possible to join online, I would be happy to join and discuss.