AWS Lambda invoker's performance depends on the Python interpreter
I've noticed a performance issue when invoking AWS Lambda functions: the time it takes for the cloud functions to start changes depending on the Python interpreter used on the client.
For example, when using the VM's own Python 3.10 interpreter on an AWS EC2 instance running Ubuntu 22.04, the start of some AWS Lambda functions is delayed by 5 to 10 seconds, as can be seen in this plot:
But using the same Python version (3.10.12) from Conda, in the same VM, with the same OS and the same AWS account, I get much better performance:
Despite the improvement when using Conda, almost 50% of the functions still take about 1 second longer to start, even when they are already warm (see the last two map stages in the previous plot). This behavior is the same for Python 3.8, 3.9, 3.10 and 3.11.
Python 3.8 plot (using conda)
Python 3.9 plot (using conda)
Python 3.10 plot (using conda)
Python 3.11 plot (using conda)
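Since the interpreter is the only thing that changes between these runs, it may help to record what each installation actually provides. The following is a minimal sketch, not part of the original test: the helper names are hypothetical, and the list of packages to check (lithops, boto3, botocore, urllib3, requests) is an assumption about what might be relevant, not a confirmed cause.

```python
import json
import platform
import ssl
import sys

try:
    from importlib import metadata  # standard library from Python 3.8
except ImportError:  # e.g. Python 3.7 without the backport
    metadata = None


def package_version(name):
    """Return the installed version of a distribution, or None if unknown."""
    if metadata is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def interpreter_fingerprint(
    packages=("lithops", "boto3", "botocore", "urllib3", "requests"),
):
    """Collect details that may differ between two Python installations."""
    return {
        "executable": sys.executable,
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "compiler": platform.python_compiler(),
        "openssl": ssl.OPENSSL_VERSION,
        "packages": {name: package_version(name) for name in packages},
    }


if __name__ == "__main__":
    # Run once with each interpreter and diff the two outputs.
    print(json.dumps(interpreter_fingerprint(), indent=2))
```

Running it once with the VM's interpreter and once with the Conda one gives two JSON documents that can be diffed directly.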
But with Python 3.7 the performance is what one would expect (almost perfect):
All of the previous plots were generated by running 3 maps of 100 functions that each sleep for 5 seconds. The test was executed from a t2.large VM with Ubuntu 22.04 in us-east-1, with the default Lithops configuration except for invoke_pool_threads, which was set to 128. I have also used the same VM type with Amazon Linux 2023, and the results are similar to the ones obtained with the Conda interpreter (I can upload the plots if requested). I used the current master branch of Lithops for this test, but the issue can also be reproduced with versions 3.0.0, 3.0.1, 2.9 and 2.7.1.
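For anyone reproducing the setup, the single non-default setting could be applied to an existing configuration dictionary along these lines. This is only a sketch under assumptions: the helper name is hypothetical, and placing invoke_pool_threads under an aws_lambda section is my assumption about the configuration layout, so check it against the configuration reference of the Lithops version in use. Credentials and all other required keys are expected to be present already in the base configuration.

```python
import copy


def with_invoke_pool_threads(base_config, threads=128, backend_section="aws_lambda"):
    """Return a copy of base_config with invoke_pool_threads overridden.

    backend_section is an assumption about where the key lives; adjust it
    if your Lithops version expects the setting elsewhere.
    """
    config = copy.deepcopy(base_config)  # leave the caller's dict untouched
    config.setdefault(backend_section, {})["invoke_pool_threads"] = threads
    return config


# Hypothetical usage, assuming my_config already holds a complete,
# working Lithops configuration dictionary:
#   fexec = lithops.FunctionExecutor(config=with_invoke_pool_threads(my_config))
```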
Here is the code used:
import time
import lithops


def count_cold_starts(futures):
    cold = 0
    warm = 0
    for future in futures:
        stats = future.stats
        if stats['worker_cold_start']:
            cold += 1
        else:
            warm += 1
    return cold, warm


futures = []
fexec = lithops.FunctionExecutor()

for _ in range(3):
    num_fun = 100

    def my_sleep(x):
        time.sleep(x)
        return num_fun

    f = fexec.map(my_sleep, [5 for _ in range(num_fun)])
    fexec.get_result()
    futures.append(f)
    cold, warm = count_cold_starts(f)
    print(f"cold: {cold}, warm: {warm}")

fexec.plot()
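To put numbers on the start delay instead of reading it off the plots, the per-call stats dictionaries could be summarized as in the sketch below. It is not part of the original test. The key names host_submit_tstamp and worker_start_tstamp are assumptions about what future.stats contains (only worker_cold_start appears in the script above), so they are left as parameters, and the 1-second threshold simply mirrors the delay described earlier. Note that the two timestamps would come from different machines, so clock skew can distort small differences.

```python
from statistics import median


def start_delays(stats_list,
                 submit_key="host_submit_tstamp",
                 start_key="worker_start_tstamp"):
    """Seconds between submission on the host and start on the worker.

    The default key names are assumptions; calls missing either key are skipped.
    """
    delays = []
    for stats in stats_list:
        if submit_key in stats and start_key in stats:
            delays.append(stats[start_key] - stats[submit_key])
    return delays


def summarize(delays, threshold=1.0):
    """Summarize delays and the fraction at or above threshold seconds."""
    if not delays:
        return {"count": 0}
    slow = sum(1 for d in delays if d >= threshold)
    return {
        "count": len(delays),
        "median": median(delays),
        "max": max(delays),
        "slow_fraction": slow / len(delays),
    }


# Hypothetical usage with the futures returned by one map stage:
#   print(summarize(start_delays([fut.stats for fut in f])))
```

Printing this summary after each of the three map stages would make it easy to compare interpreters without relying on the plots.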