asgiref icon indicating copy to clipboard operation
asgiref copied to clipboard

Discussion: Possible minor performance improvement in Local._get_context_id

Open kezabelle opened this issue 3 years ago • 1 comments

This is rudimentary synthetic benchmarking and investigation. Caveat emptor, and apologies if this is wrong and a waste of your time reviewing it.

Whilst looking at the yappi/cProfile output for a couple of tickets in Django proper, I stumbled on a small amount of time being spent in Local._get_context_id() (because of things like i18n machinery). Using line_profiler and the %lprun ipython magic it seems like 20% of time is being spent on:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    51    100001     222250.0      2.2     20.1          from .sync import AsyncToSync, SyncToAsync

which is notably and understandably to Prevent a circular reference

Using the following ad-hoc test:

from asgiref.local import Local
x = Local()
def timeonly():
    setattr(x, 'test', None)
    for _ in range(100_000):
        getattr(x, 'test')
if __name__ == '__main__':
    timeonly()

in ipython:

In [1]: from local_test import timeonly

In [2]: %timeit timeonly()
418 ms ± 2.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit timeonly()
418 ms ± 4.86 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit -n100 timeonly()
511 ms ± 29.9 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

The change I'm asking you to consider (given any issues around thread-safety or monkeypatched modules which I can't begin to fathom) is:

modules = sys.modules
if 'asgiref.sync' in modules:
    sync = modules['asgiref.sync']
    AsyncToSync, SyncToAsync = sync.AsyncToSync, sync.SyncToAsync
else:
    from .sync import AsyncToSync, SyncToAsync

or some equivalent thereof which meets your preferred style.

Notably it doesn't alter the percentage of time spent, according to line profiler (ie it's still basically 20%):

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    51    100001      48007.0      0.5      4.5          modules = sys.modules
    52    100001      48612.0      0.5      4.6          if 'asgiref.sync' in modules:
    53    100000      49366.0      0.5      4.6              sync = modules['asgiref.sync']
    54    100000      52234.0      0.5      4.9              AsyncToSync, SyncToAsync = sync.AsyncToSync, sync.SyncToAsync
    55                                                   else:
    56         1       4183.0   4183.0      0.4              from .sync import AsyncToSync, SyncToAsync

but does appear to be quicker overall:

In [1]: from local_test import timeonly

In [2]: %timeit timeonly()
319 ms ± 1.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit timeonly()
322 ms ± 4.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit -n100 timeonly()
378 ms ± 16.8 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

As far as I can tell from running the ad-hoc test (above) through yappi the saving is skipping the parts of the importer which ultimately do something with a ModuleSpec.parent (amounting to 6%-8% of time, according to pycharm's pstats visualiser):

   ncalls  tottime  percall  cumtime  percall filename:lineno(function
   100141    0.377    0.000    0.556    0.000 <frozen importlib._bootstrap>:398(ModuleSpec.parent)

vs (after proposed patch):

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      141    0.001    0.000    0.001    0.000 <frozen importlib._bootstrap>:398(ModuleSpec.parent)

Now, given I've only looked at this on CPython 3.9.5, and haven't thought through all of the possible issues that skipping the importer might give (eg: non standard packaging situations where it's not using one of the standard things like SourceFileLoader? IDK!), I certainly don't anticipate it being a simple 'yup commit it' (especially given asgiref's growing importance to the ecosystem, and thus it's reach) but I thought it worth opening a discussion, even if it's just to tell me I'm wrong in ways X/Y/Z.

kezabelle avatar Jun 08 '21 14:06 kezabelle

Python imports are notoriously slow, but yes, I'm not quite sure I want to do it quite as you describe - however, I can see a situation where the thing imported is loaded onto a class attribute and then we can just do a if self.AsyncToSync is None: guard around this section - or honestly, write a nice functional wrapper around the concept.

I'd happily take a PR for this, but I've not got time to work on it myself right now.

andrewgodwin avatar Jun 08 '21 15:06 andrewgodwin