asgiref
Keep thread pool executors around.
I'd love to hear your thoughts on this one @carltongibson and @andrewgodwin. It just seems so wasteful to recreate new threads all the time when we could just reuse them.
With this (obviously somewhat artificial) change, the timings go down from 15 seconds to 8 seconds:
```python
#!/usr/bin/env python3
import asyncio

from asgiref.sync import ThreadSensitiveContext, sync_to_async


def sync():
    pass


async def test():
    for i in range(100_000):
        async with ThreadSensitiveContext():
            await sync_to_async(sync)()


async def main():
    await asyncio.gather(test())


if __name__ == "__main__":
    import time

    s = time.perf_counter()
    asyncio.run(main())
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.2f} seconds.")
```
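To isolate what is being saved here, a rough standalone sketch (not asgiref's actual code) can compare spinning up a fresh single-thread executor per call, which is roughly what a fresh `ThreadSensitiveContext` forces, against reusing one long-lived executor. The `N` and the structure are just for illustration:

```python
# Hypothetical sketch: per-call executor creation vs. a reused executor.
# This is not how asgiref is implemented, only an illustration of the
# thread-creation overhead being discussed.
import time
from concurrent.futures import ThreadPoolExecutor


def sync():
    pass


N = 2_000

# A fresh executor (and therefore a fresh thread) for every call.
start = time.perf_counter()
for _ in range(N):
    with ThreadPoolExecutor(max_workers=1) as ex:
        ex.submit(sync).result()
fresh = time.perf_counter() - start

# One long-lived executor whose single thread is reused across calls.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=1) as ex:
    for _ in range(N):
        ex.submit(sync).result()
pooled = time.perf_counter() - start

print(f"fresh: {fresh:.2f}s  pooled: {pooled:.2f}s")
```

On any machine the pooled variant should come out well ahead, since it pays the thread start-up cost once instead of `N` times.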
Completely unscientifically, this means that the time to create and destroy 100,000 threads is roughly 7 seconds on my machine. I am not really sure what a good thread count would be; it might make sense to copy the default max_workers from the thread pool executor.
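For reference, since Python 3.8 CPython's `ThreadPoolExecutor` defaults `max_workers` to `min(32, os.cpu_count() + 4)`; reading the private `_max_workers` attribute below is only to demonstrate that, not something to rely on:

```python
# Show the default worker count a ThreadPoolExecutor picks when
# max_workers is not given (CPython 3.8+: min(32, os.cpu_count() + 4)).
import os
from concurrent.futures import ThreadPoolExecutor

ex = ThreadPoolExecutor()  # max_workers=None -> use the default
expected = min(32, (os.cpu_count() or 1) + 4)
print(ex._max_workers, expected)  # _max_workers is a private CPython detail
ex.shutdown()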
The irony of pooling a threadpool is not lost on me, but if it provides significant enough performance benefits, maybe it's worth it. Any idea what % saving this is on a more traditional use case?
Most likely negligible (looking at the "raw" numbers, 7 seconds for 100,000 threads works out to well under a millisecond per thread, if I did the math correctly ;)). That said, I do not know how much thread state Python/Django can accumulate that would require cleaning up and could increase that number. There is also the question of how well threads perform on other systems; I do not know much about Windows or macOS in that regard.
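The back-of-the-envelope arithmetic behind "way below a millisecond" is simply the measured 7-second difference divided across the 100,000 threads:

```python
# Per-thread overhead implied by the benchmark numbers quoted above.
total_seconds = 7.0      # measured difference across the whole run
threads = 100_000
per_thread_ms = total_seconds / threads * 1000
print(f"{per_thread_ms:.3f} ms per thread")  # 0.070 ms, well under 1 ms
```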
What it would allow for (in theory -- it currently breaks because django.db.connection is context-aware and not just a thread-local) is that you could run with a pool of, let's say, 10 threads and reuse connections (i.e. run with persistent connections).
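A minimal sketch of why a reused pool thread could, in principle, reuse a connection: state stashed in a `threading.local` survives across jobs submitted to the same pooled thread. (This is an assumption-laden illustration; the stand-in "connection" is just an `object()`, and the point above is precisely that django.db.connection is context-aware rather than a plain thread-local, so it does not line up this way automatically.)

```python
# Illustration: thread-local state persists across jobs on a pooled thread.
import threading
from concurrent.futures import ThreadPoolExecutor

local = threading.local()


def get_connection():
    # Open a "connection" only if this thread does not already have one.
    if not hasattr(local, "conn"):
        local.conn = object()  # stands in for a real DB connection
    return local.conn


with ThreadPoolExecutor(max_workers=1) as ex:
    first = ex.submit(get_connection).result()
    second = ex.submit(get_connection).result()

print(first is second)  # True: the pooled thread kept its connection
```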
Hm, yeah, the extra complexity for such a small performance gain does give me a little pause. If there's another convincing reason to add it, like the connections, that might change things -- but my understanding of the problem with connection management is that the same request needs to stay on the same thread/transaction, and that's harder.