blast_plus_docs
blastn 2.11.0 in Docker hangs phoning home
My apologies if this is not the correct place to report this, but I would expect the issue to show up in this project too.
I am running blastn in a docker container, copying it in from the binary tarball from https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+.
Since upgrading to 2.11.0, blastn calls take excessively long to finish, or do not complete at all.
Setting BLAST_USAGE_REPORT=false resolves the issue. This raises the strong suspicion that the new usage reporting feature is the culprit.
The issue can be reproduced by creating a docker image with a simple Dockerfile:
FROM ubuntu # or your preferred starting image
COPY blastn /usr/local/bin
USER nobody:nogroup
After building the image with docker build -t test-bug ".", observe the difference between:
docker run -ti --rm --read-only -e BLAST_USAGE_REPORT=false test-bug blastn -help
and
docker run -ti --rm --read-only test-bug blastn -help
The hiccup is sub-second but already noticeable. Start a longer-running local blastn call, and runtimes that are normally around 15s go up to many minutes, while top shows the processes as mostly sleeping.
Update for the record: the excessively long run times were not for single runs of blastn. They happened in our pipeline, where we do a few dozen calls in series. These should take ~20s altogether, but their added-up "hang time" made the job time out after 20 min.
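For a pipeline like this, the workaround can be applied once rather than on every call. The following is only a sketch, assuming the BLAST+ binaries honour BLAST_USAGE_REPORT=false as observed above; the helper name without_usage_report and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run any external command with BLAST usage
# reporting switched off for that one invocation only.
without_usage_report() {
    env BLAST_USAGE_REPORT=false "$@"
}

# Usage (query.fa and mydb are placeholder names):
#   without_usage_report blastn -query query.fa -db mydb
#
# Alternatively, disable it for every later call in the same script:
#   export BLAST_USAGE_REPORT=false
```

In a Docker setup the same variable could presumably also be set once with an ENV line in the Dockerfile, so that it does not have to be passed with -e on each docker run.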
Hello, Thank you for your report. We will try to reproduce this issue. From your description, I assume you are not on a cloud provider but rather on your own hardware.
Tom
@zwets - I was not able to reproduce this issue using the GCP Cloud Shell as described in this tutorial. After initiating the Cloud Shell, I was able to run the following commands successfully -
docker run --rm -e BLAST_USAGE_REPORT=false ncbi/blast blastn -help
docker run --rm ncbi/blast blastn -help
If you simply copy the executable to the Docker image, there may be missing dependencies or other issues. I would 1) use the official NCBI BLAST image or 2) ftp the entire tarball into the Docker container and unzip/build inside the container. Hope this helps.
Thank you for getting back on this. I am running this on a laptop, so the issue really is with blastn rather than the BLAST docker image.
I wasn't initially able to reproduce the issue, until we had a network interruption (this is Africa). Sure enough: blastn -help under docker took over 20s. Running it straight on Linux wasn't quite as bad, but still close to 6s. With BLAST_USAGE_REPORT=false this goes down to the expected 0.00s.
I am currently pulling the ncbi/blast container to see if it has the same issue (but at 1GB that takes a while here). FTR, while it is pulling, blastn -help takes ~7s both inside and outside the container ...
Clearly, the default "on" setting for the usage reporting isn't great for parts of the world outside of the well-connected north, or indeed anyone running blastn on a disconnected machine, especially in docker.
I will report on what happens with the ncbi/blast image when I have it on my machine.
thanks for the additional information. I had tried it on my laptop with the wifi turned off (months ago, pre-release) and didn't see a problem, but I didn't try it on a slower network (or one half a world away). I'll try the latest version on my laptop (with wifi off) again to see what happens in case something changed. I don't think that docker will be better since it just wraps the BLAST+ executables. I'll speak to our developers about whether we can do better in upcoming releases.
Hi Tom, here are the results for the ncbi/blast image on my laptop - and they don't look great :-/
$ time docker run --rm -e BLAST_USAGE_REPORT=false ncbi/blast blastn -help >/dev/null
real 0m0.870s
# With 'normal' network (~4Mbps, but DNS has high latency here)
$ time docker run --rm ncbi/blast blastn -help >/dev/null
real 0m5.600s
# With network disconnected ... oops!
$ time docker run --rm ncbi/blast blastn -help >/dev/null
real 1m0.914s
I reran the last one a few times. It is close to a 1 minute wait every time.
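For anyone wanting to repeat this comparison, a small helper along these lines may be convenient. It is a rough sketch only: elapsed_seconds is a hypothetical name, it measures whole seconds via date +%s, and the docker commands in the usage comment assume the ncbi/blast image has already been pulled.

```shell
#!/bin/sh
# Hypothetical helper: run a command, discard its output, and print
# the wall-clock time it took in whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true   # ignore the command's exit status
    end=$(date +%s)
    echo $((end - start))
}

# Usage (requires Docker and the ncbi/blast image):
#   elapsed_seconds docker run --rm -e BLAST_USAGE_REPORT=false ncbi/blast blastn -help
#   elapsed_seconds docker run --rm ncbi/blast blastn -help
```

Whole-second resolution is too coarse for the sub-second case, but it should be enough to tell a normal run from one that waits on the network.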
thanks. We'll need to look at that.
I suppose under docker the issue is worse because the containers sit on their own network, and won't see their link go down when the outside link is down (unless Docker would do this).
On top of that, resolving www.ncbi.nlm.nih.gov is very slow here (and occasionally fails with a timeout), apparently due to the number of recursions and high latency. And with a TTL of only 30s it won't stay cached, so this adds a few seconds to every blastn call.
Anyway, thanks for looking into this!