elasticsearch-py
Function not found EXCEPTION: [Errno 38] when using helpers.parallel_bulk in aws lambda
Describe the feature:
Elasticsearch version (bin/elasticsearch --version):
elasticsearch-py version (elasticsearch.__versionstr__):
elasticsearch==7.19.9
Please make sure the major version matches the Elasticsearch server you are running.
Description of the problem including expected versus actual behavior:
We were using the elasticsearch helper parallel_bulk inside an AWS Lambda function. It worked with this same version of elasticsearch on the Python 3.7 runtime.
Now that we have upgraded to the Python 3.11 runtime, I get this error when it tries to execute parallel_bulk:
EXCEPTION: [Errno 38] Function not implemented
It might be because Lambda doesn't support some parts of the Python multiprocessing package.
- https://stackoverflow.com/questions/34005930/multiprocessing-semlock-is-not-implemented-when-running-on-aws-lambda
- https://aws.amazon.com/blogs/compute/parallel-processing-in-python-with-aws-lambda/
- https://stackoverflow.com/questions/60816172/numba-issues-multiprocessing-userwarning-when-running-in-aws-lambda
Steps to reproduce:
Try uploading documents to Elasticsearch from an AWS Lambda function on the Python 3.11 runtime using the parallel_bulk helper.
Thanks for the report. We may want to try to make this work on AWS Lambda, but I'm confused: how could this work on Python 3.7, since multiprocessing.pool.ThreadPool does not work on AWS Lambda?
Also, can you please share the full exception/traceback?
Closing, but I'll reopen if I get more details. Thank you!
Got another report that this indeed fails starting with Python 3.8, and the links above give possible workarounds to support AWS Lambda. I also now understand why it works on Python 3.7: on Python 3.8 and above, ThreadPool uses SemLock, which isn't supported by AWS Lambda.
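For anyone who wants to confirm this in their own environment, here is a minimal probe, assuming that creating a multiprocessing.Lock is enough to exercise SemLock. The function name semlock_available is made up for this example and is not part of elasticsearch-py.

```python
import multiprocessing


def semlock_available():
    """Return True if this runtime can create a multiprocessing lock.

    Hypothetical helper for diagnosis only. multiprocessing.Lock is backed
    by SemLock, so creating one should fail on runtimes that lack it.
    """
    try:
        multiprocessing.Lock()
    except (ImportError, OSError):
        # OSError covers "[Errno 38] Function not implemented".
        return False
    return True


if __name__ == "__main__":
    print("SemLock available:", semlock_available())
```

If this returns False, I would expect parallel_bulk to fail the same way on Python 3.8 and above.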
We still want to use the faster ThreadPool when possible, but fall back to Pipe when it's not available.
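A rough sketch of the "use ThreadPool when possible" half of that idea, not the actual elasticsearch-py change. The names map_with_fallback and pool_factory are hypothetical, and the fallback here is plain sequential processing standing in for the Pipe-based approach, which is more involved.

```python
import errno
from multiprocessing.pool import ThreadPool


def map_with_fallback(func, items, thread_count=4, pool_factory=ThreadPool):
    """Apply func to every item, using a thread pool when one can be created.

    Hypothetical helper. pool_factory is only a parameter so that the
    failure path can be exercised without running on AWS Lambda.
    """
    try:
        pool = pool_factory(thread_count)
    except OSError as exc:
        # ENOSYS is "Function not implemented" (errno 38 on Linux),
        # the error from this report. Anything else is re-raised.
        if exc.errno != errno.ENOSYS:
            raise
        # Sequential stand-in for the Pipe-based fallback.
        return [func(item) for item in items]
    try:
        return pool.map(func, items)
    finally:
        pool.close()
        pool.join()
```

Until something like this lands in the client, switching to helpers.bulk or helpers.streaming_bulk, which do not use a thread pool, may be the simplest workaround on Lambda.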
This is very unlikely to be backported to elasticsearch-py 7.x, but should be available in a later elasticsearch-py 8.x version. (The migration path is easier now, with changes to the body parameter that went in 8.12.)