docker-builds

[Request An Update]: Update prokka 1.14.5 container

Open SESchmedes opened this issue 2 years ago • 2 comments

What container needs an update?

There is a bug in prokka that causes random failures at the blastp step. Potentially need to update container to use pre-built blast binaries from NCBI.

(example partial output from a failed run)

[14:45:31] Will use blast to search against /prokka-1.14.5/db/kingdom/Bacteria/IS with 8 CPUs
[14:45:31] Running: cat /blue/bphl-florida/thsalikilakshmi/data/jbi/20220824_jax_220629_PLN_WLK_MS/2022-08-24_flaq_run/JBI22000480/JBI22000480_assembly/prokka/JBI22000480.IS.tmp.43213.faa | parallel --gnu --plain -j 8 --block 131976 --recstart '>' --pipe blastp -query - -db /prokka-1.14.5/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > /blue/bphl-florida/thsalikilakshmi/data/jbi/20220824_jax_220629_PLN_WLK_MS/2022-08-24_flaq_run/JBI22000480/JBI22000480_assembly/prokka/JBI22000480.IS.tmp.43213.blast 2> /dev/null
[14:45:31] Could not run command: cat /blue/bphl-florida/thsalikilakshmi/data/jbi/20220824_jax_220629_PLN_WLK_MS/2022-08-24_flaq_run/JBI22000480/JBI22000480_assembly/prokka/JBI22000480.IS.tmp.43213.faa | parallel --gnu --plain -j 8 --block 131976 --recstart '>' --pipe blastp -query - -db /prokka-1.14.5/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > /blue/bphl-florida/thsalikilakshmi/data/jbi/20220824_jax_220629_PLN_WLK_MS/2022-08-24_flaq_run/JBI22000480/JBI22000480_assembly/prokka/JBI22000480.IS.tmp.43213.blast 2> /dev/null

We run prokka as a Singularity container (pulled from StaphB docker builds) in pipelines that work for months, then fail at random, then go back to working. The failures appear completely random. We have tested on different HPC node partitions, etc., and we can't explain it.

This is the prokka command we use in our pipeline:

singularity exec --bind sample_dir:/data --pwd /data --cleanenv /apps/staphb-toolkit/containers/prokka_1.14.5.sif prokka --cpus --genus --species --strain --outdir sample_dir/prokka --prefix --force --compliant --locustag sample_dir/fasta

There have been multiple issues opened at https://github.com/tseemann/prokka/issues related to this same issue at the blastp step. Per previous discussions on StaphB Slack, this may be an issue with the blast binaries.

We previously discussed adjusting the prokka container to use pre-built blast binaries directly from NCBI instead of the ones shipped with the prokka code.

SESchmedes, Aug 26 '22 15:08

Thanks for submitting the issue.

I have one ask, and I totally understand if it's not possible due to the HPC setup: could you re-run that last command without the 2>/dev/null at the end? That way we can see the more explicit and helpful error message from the blastp command.

So run:

cat /blue/bphl-florida/thsalikilakshmi/data/jbi/20220824_jax_220629_PLN_WLK_MS/2022-08-24_flaq_run/JBI22000480/JBI22000480_assembly/prokka/JBI22000480.IS.tmp.43213.faa | parallel --gnu --plain -j 8 --block 131976 --recstart '>' --pipe blastp -query - -db /prokka-1.14.5/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > /blue/bphl-florida/thsalikilakshmi/data/jbi/20220824_jax_220629_PLN_WLK_MS/2022-08-24_flaq_run/JBI22000480/JBI22000480_assembly/prokka/JBI22000480.IS.tmp.43213.blast

Either way, we can work on providing a docker image for prokka v1.14.6 and use blast binaries directly from NCBI instead of the ones shipped with prokka code.
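To keep the error output from a command like the one above instead of discarding it, stderr can be redirected to a file. The following is a minimal sketch, assuming a POSIX shell; `run_logged` is a hypothetical helper name, not part of prokka or the StaphB images.

```shell
# Hypothetical helper (not part of prokka): run a command and save its
# stderr to a log file instead of sending it to /dev/null.
run_logged() {
    log_file="$1"
    shift
    status=0
    # Run the remaining arguments as a command; only stderr is redirected.
    "$@" 2> "$log_file" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status; stderr saved to $log_file" >&2
    fi
    return "$status"
}
```

A pipeline can be passed through `sh -c`, for example `run_logged blast.err sh -c 'cat input.faa | parallel ... > output.blast'`, where the file names are placeholders. The status reported is that of the last command in the pipeline.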

kapsakcj, Aug 26 '22 16:08

I ran prokka interactively in the container but I did not get a failed run. It worked fine. It's hard to reproduce because it really is random.

I did run that specific blastp command (see below) on a final .faa file just to see what the output is if I removed 2>/dev/null and got the following output:

Singularity> cat JBI22000715.faa | parallel --gnu --plain -j 8 --block 84409 --recstart '>' --pipe blastp -query - -db /prokka-1.14.5/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > prokka\/JBI22000715\.IS\.tmp\.49688\.blast

/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
/bin/bash: blastp: command not found
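The `command not found` messages indicate that bash could not find `blastp` on the `PATH` of that shell session. A minimal sketch for checking which tools in the pipeline are visible, assuming a POSIX shell; `check_tools` is a hypothetical helper name, not something shipped with prokka or the container.

```shell
# Hypothetical helper: report whether each named tool can be found on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if path="$(command -v "$tool")"; then
            echo "found: $tool -> $path"
        else
            echo "MISSING: $tool"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Running `check_tools cat parallel blastp` in the same shell session where the command failed would show whether `blastp` resolves there at all.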

SESchmedes, Aug 26 '22 17:08

Closed via PR #446

kapsakcj, Nov 30 '22 23:11