
AWS Lambda invoker's performance depends on the Python interpreter

Open gfinol opened this issue 6 months ago • 10 comments

I've noticed a performance issue when invoking AWS Lambda functions: the invocation performance changes depending on the Python interpreter used.

For example, when using the system Python 3.10 interpreter on an AWS EC2 VM running Ubuntu 22.04, the start of some AWS Lambda functions is delayed by between 5 and 10 seconds, as can be seen in this plot:

python31012-system1702566715_timeline

But using the same Python version (3.10.12) from Conda on the same VM, with the same OS and the same AWS account, I obtained much better performance: python31012-conda1702567642_timeline

Despite the performance improvement with Conda, almost 50% of the functions still take 1 second longer to start, even when they are already warm (see the last two map stages in the previous plot). This behavior is the same for Python 3.8, 3.9, 3.10 and 3.11.

Python 3.8 plot (using conda)

python38-conda1702566938_timeline

Python 3.9 plot (using conda)

python39-conda1702567023_timeline

Python 3.10 plot (using conda)

python31013-conda1702567068_timeline

Python 3.11 plot (using conda)

python311-conda1702567151_timeline

But with Python 3.7 the performance is what one would expect (almost perfect): python37-conda1702566853_timeline

All of the previous plots were generated by running 3 maps of 100 functions that each sleep for 5 seconds. The test was executed from a t2.large VM with Ubuntu 22.04 in us-east-1, with all Lithops settings left at their defaults except invoke_pool_threads, which was set to 128. I also used the same VM with Amazon Linux 2023, and the results are similar to the previous ones obtained with the Conda interpreter (I can upload the plots if requested). I used the current master branch of Lithops for this test, but the issue can also be reproduced with versions 3.0.0, 3.0.1, 2.9 and 2.7.1.
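
For reference, the single non-default setting described above could be expressed as a Python config dictionary along these lines. This is only a sketch: placing `invoke_pool_threads` under an `aws_lambda` section is an assumption about the Lithops configuration layout, and credentials, region and storage settings are deliberately left out, so the fragment is not a complete configuration.

```python
# Sketch of the one non-default setting used in the test.
# Assumption: invoke_pool_threads belongs to the "aws_lambda" section.
# Credentials, region and storage settings are omitted on purpose.
config = {
    "lithops": {"backend": "aws_lambda"},
    "aws_lambda": {"invoke_pool_threads": 128},
}

# A complete config could then be passed as
# lithops.FunctionExecutor(config=config) instead of relying on a config file.
print(config["aws_lambda"]["invoke_pool_threads"])
```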

Here is the code used:

import time
import lithops

def count_cold_starts(futures):
    cold = 0
    warm = 0
    for future in futures:
        stats = future.stats
        if stats['worker_cold_start']:
            cold += 1
        else:
            warm += 1
    return cold, warm

futures = []
fexec = lithops.FunctionExecutor()
for _ in range(3):
    num_fun = 100

    def my_sleep(x):
        time.sleep(x)
        return num_fun

    f = fexec.map(my_sleep, [5 for _ in range(num_fun)])
    fexec.get_result()
    futures.append(f)

    cold, warm = count_cold_starts(f)

    print(f"cold: {cold}, warm: {warm}")

fexec.plot()
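
To put numbers on the start delays visible in the plots, the per-call stats could be summarized with a small helper such as the one below. It is a sketch under stated assumptions: the key names `host_submit_tstamp` and `worker_start_tstamp` are assumed to be present in `future.stats`, and the helper names `start_delays` and `summarize_delays` are hypothetical. Since the two timestamps come from different machines, clock differences can shift the absolute values.

```python
from statistics import median


def start_delays(stats_list,
                 submit_key="host_submit_tstamp",
                 start_key="worker_start_tstamp"):
    """Return the delay in seconds between submission and worker start.

    The default key names are assumptions about the contents of
    future.stats; entries missing either key are skipped.
    """
    delays = []
    for stats in stats_list:
        if submit_key in stats and start_key in stats:
            delays.append(stats[start_key] - stats[submit_key])
    return delays


def summarize_delays(delays, threshold=1.0):
    """Summarize delays and the fraction at or above `threshold` seconds."""
    if not delays:
        return {"count": 0, "median": None, "max": None, "slow_fraction": None}
    slow = sum(1 for d in delays if d >= threshold)
    return {
        "count": len(delays),
        "median": median(delays),
        "max": max(delays),
        "slow_fraction": slow / len(delays),
    }
```

With the script above, this could be called after each map as `summarize_delays(start_delays([fut.stats for fut in f]))`, which would give the share of functions that started at least 1 second late in that stage.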

gfinol, Dec 15 '23 09:12