CTranslate2
CTranslate2 crashes on AWS Lambda
Hello,
CTranslate2 crashes with the following error when started inside of an AWS Lambda:
OMP: Error #179: Function Can't open SHM2 failed:
OMP: System error #2: No such file or directory
The problem happens when a new Translator object is created: translator = ctranslate2.Translator(model_path, device="cpu")
This can be reproduced inside a Docker container without AWS if you start the container like this: docker run --ipc=none ...
I researched this problem online, and the issue is that Lambda does not provide /dev/shm. Puppeteer has the same problem, and it provides a flag to turn off /dev/shm usage, see https://github.com/puppeteer/puppeteer/blob/v1.0.0/docs/troubleshooting.md#tips. Firefox with Selenium has the same issue but does not provide such a flag, and it seems that no one has found a way to run headless Firefox inside a Lambda.
Is there a way CTranslate2 could stop using /dev/shm, for example based on an environment variable?
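To tell in advance whether an environment is affected, one option is to check for /dev/shm before creating the Translator. The following is only a minimal sketch: the helper name shm_available is made up for illustration, and the check assumes that a missing or unwritable /dev/shm is what triggers the OpenMP error, as described above.

```python
import os


def shm_available(path="/dev/shm"):
    """Return True if `path` is an existing directory this process can write to.

    Hypothetical helper, not part of CTranslate2.
    """
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    if shm_available():
        print("/dev/shm is usable")
    else:
        # Assumption: this is the condition under which the OpenMP runtime
        # fails at startup, as in the error reported above.
        print("/dev/shm is missing or not writable")
```

Running this inside a container started with docker run --ipc=none should take the second branch, since /dev/shm is not present there.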
Thank you so much already
Hi,
The error comes from OpenMP (OMP), which is used for multithreading. It was also reported a few months ago on the forum. At that time I simply suggested using EC2 instances instead of Lambdas.
This can be reproduced inside of a docker container without AWS, if you start the container like this docker run --ipc=none ...
Thank you for the tip. Indeed /dev/shm is no longer present when using this flag but I did not manage to reproduce the error. Can you share a full example to reproduce the error with Docker?
Hi @guillaumekln,
Thank you for your quick reply. I created a repository to make the bug easier to reproduce: ctranslate_dev_shm_issue
Unfortunately, we can't use EC2 at our company; we are supposed to use Lambda.
Thanks for the reproducer. I was using Ubuntu 20.04 as the base image and did not reproduce the error (probably for the same reason you did not reproduce with Debian).
We are currently building with the LLVM OpenMP runtime from Intel. This error happens early when the runtime is initialized and I did not find a way to change this behavior (see https://github.com/llvm/llvm-project/issues/53955).
However, using the GNU OpenMP runtime seems to fix the error. This is probably the only reasonable solution to this issue and I will consider this change for the next version.
Version 2.24.0 includes this change.
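For anyone who wants to check which OpenMP runtime their process has actually loaded, a rough sketch on Linux is to scan /proc/self/maps for known runtime library names. The function names below are made up for illustration, and the library names (libiomp5, libgomp, libomp) are assumptions about how the runtimes are typically named on disk; bundled copies in a wheel may carry a different name.

```python
import re

# Assumed file-name prefixes: Intel (libiomp5), GNU (libgomp), LLVM (libomp).
_OPENMP_PATTERN = re.compile(r"(libiomp5|libgomp|libomp)[^/\s]*\.so")


def openmp_runtimes(maps_text):
    """Return the sorted OpenMP runtime names found in /proc/<pid>/maps text."""
    return sorted({m.group(1) for m in _OPENMP_PATTERN.finditer(maps_text)})


def loaded_openmp_runtimes(maps_path="/proc/self/maps"):
    """Return the OpenMP runtimes mapped into the current process (Linux only)."""
    try:
        with open(maps_path) as f:
            return openmp_runtimes(f.read())
    except OSError:
        # /proc is not available (for example on non-Linux systems).
        return []


if __name__ == "__main__":
    print(loaded_openmp_runtimes())
```

The result is only meaningful after import ctranslate2 has run in the same process, since the runtime is mapped when the library is loaded.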
@Pita, could you verify that you can now run on AWS Lambda?
Hello @guillaumekln,
I just tested it again with the latest version and I can confirm it works now. Thank you so much for this fast fix!