asgiref
asgiref copied to clipboard
Discussion: Possible minor performance improvement in Local._get_context_id
This is rudimentary synthetic benchmarking and investigation. Caveat emptor, and apologies if this is wrong and a waste of your time reviewing it.
Whilst looking at the yappi/cProfile output for a couple of tickets in Django proper, I stumbled on a small amount of time being spent in Local._get_context_id()
(because of things like i18n machinery). Using line_profiler
and the %lprun
ipython magic it seems like 20% of time is being spent on:
Line # Hits Time Per Hit % Time Line Contents
==============================================================
51 100001 222250.0 2.2 20.1 from .sync import AsyncToSync, SyncToAsync
which is notably and understandably to Prevent a circular reference
Using the following ad-hoc test:
from asgiref.local import Local
x = Local()
def timeonly():
setattr(x, 'test', None)
for _ in range(100_000):
getattr(x, 'test')
if __name__ == '__main__':
timeonly()
in ipython:
In [1]: from local_test import timeonly
In [2]: %timeit timeonly()
418 ms ± 2.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [3]: %timeit timeonly()
418 ms ± 4.86 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [4]: %timeit -n100 timeonly()
511 ms ± 29.9 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
The change I'm asking you to consider (given any issues around thread-safety or monkeypatched modules which I can't begin to fathom) is:
modules = sys.modules
if 'asgiref.sync' in modules:
sync = modules['asgiref.sync']
AsyncToSync, SyncToAsync = sync.AsyncToSync, sync.SyncToAsync
else:
from .sync import AsyncToSync, SyncToAsync
or some equivalent thereof which meets your preferred style.
Notably it doesn't alter the percentage of time spent, according to line profiler (ie it's still basically 20%
):
Line # Hits Time Per Hit % Time Line Contents
==============================================================
51 100001 48007.0 0.5 4.5 modules = sys.modules
52 100001 48612.0 0.5 4.6 if 'asgiref.sync' in modules:
53 100000 49366.0 0.5 4.6 sync = modules['asgiref.sync']
54 100000 52234.0 0.5 4.9 AsyncToSync, SyncToAsync = sync.AsyncToSync, sync.SyncToAsync
55 else:
56 1 4183.0 4183.0 0.4 from .sync import AsyncToSync, SyncToAsync
but does appear to be quicker overall:
In [1]: from local_test import timeonly
In [2]: %timeit timeonly()
319 ms ± 1.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [3]: %timeit timeonly()
322 ms ± 4.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [4]: %timeit -n100 timeonly()
378 ms ± 16.8 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
As far as I can tell from running the ad-hoc test (above) through yappi the saving is skipping the parts of the importer which ultimately do something with a ModuleSpec.parent
(amounting to 6%-8% of time, according to pycharm's pstats visualiser):
ncalls tottime percall cumtime percall filename:lineno(function
100141 0.377 0.000 0.556 0.000 <frozen importlib._bootstrap>:398(ModuleSpec.parent)
vs (after proposed patch):
ncalls tottime percall cumtime percall filename:lineno(function)
141 0.001 0.000 0.001 0.000 <frozen importlib._bootstrap>:398(ModuleSpec.parent)
Now, given I've only looked at this on CPython 3.9.5, and haven't thought through all of the possible issues that skipping the importer might give (eg: non standard packaging situations where it's not using one of the standard things like SourceFileLoader? IDK!), I certainly don't anticipate it being a simple 'yup commit it' (especially given asgiref's growing importance to the ecosystem, and thus it's reach) but I thought it worth opening a discussion, even if it's just to tell me I'm wrong in ways X/Y/Z.
Python imports are notoriously slow, but yes, I'm not quite sure I want to do it quite as you describe - however, I can see a situation where the thing imported is loaded onto a class attribute and then we can just do a if self.AsyncToSync is None:
guard around this section - or honestly, write a nice functional wrapper around the concept.
I'd happily take a PR for this, but I've not got time to work on it myself right now.