Expensive Calls to `ir.Inst.list_vars` During Compilation

Open srilman opened this issue 1 year ago • 0 comments

Reporting a bug

  • [x] I have tried using the latest released version of Numba (most recent is visible in the release notes: https://numba.readthedocs.io/en/stable/release-notes-overview.html).
  • [x] I have included a self-contained code sample to reproduce the problem, i.e. it's possible to run as 'python bug.py'.

From the perf test in issue https://github.com/numba/numba/issues/9700, we saw that calls to `ir.Inst.list_vars` cumulatively took about 12.7% of total compilation time. That's primarily because every call to `list_vars` invokes the internal helper function `_rec_list_vars`, which generally iterates over a dictionary-like data structure (usually the `self.__dict__` of a subclass instance) to find all `ir.Var` objects, which is pretty expensive.

For most of the subclasses of `ir.Inst`, the attributes that hold `ir.Var` objects are known beforehand, because the constructor verifies them via assertions. Thus, we don't need to search recursively in those cases. We can override `list_vars` in those subclasses to return the result statically.
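To illustrate the idea, here is a minimal sketch using simplified stand-in classes. `Var`, `Inst`, and `SetItemLike` below are hypothetical and are not Numba's actual implementation; they only mimic the pattern of a generic recursive search versus a static override.

```python
class Var:
    """Hypothetical, simplified stand-in for ir.Var."""

    def __init__(self, name):
        self.name = name


class Inst:
    """Hypothetical stand-in for ir.Inst with a generic, recursive list_vars."""

    def _rec_list_vars(self, val):
        # Walk nested containers looking for Var objects.
        if isinstance(val, Var):
            return [val]
        if isinstance(val, Inst):
            return val.list_vars()
        if isinstance(val, (list, tuple)):
            found = []
            for item in val:
                found.extend(self._rec_list_vars(item))
            return found
        if isinstance(val, dict):
            found = []
            for item in val.values():
                found.extend(self._rec_list_vars(item))
            return found
        return []

    def list_vars(self):
        # Generic path: scan every attribute stored on the instance.
        return self._rec_list_vars(self.__dict__)


class SetItemLike(Inst):
    """Hypothetical instruction whose Var attributes are fixed by the constructor."""

    def __init__(self, target, index, value, loc=None):
        assert isinstance(target, Var)
        assert isinstance(index, Var)
        assert isinstance(value, Var)
        self.target = target
        self.index = index
        self.value = value
        self.loc = loc

    def list_vars(self):
        # Static path: the assertions above guarantee these are the only Vars.
        return [self.target, self.index, self.value]


a, i, v = Var("a"), Var("i"), Var("v")
inst = SetItemLike(a, i, v)
generic = Inst.list_vars(inst)  # recursive search over inst.__dict__
static = inst.list_vars()       # direct return, no recursion
```

Both paths return the same variables in this sketch, but the override skips the dictionary walk and the repeated `isinstance` checks.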

Code can be found in https://github.com/numba/numba/issues/9714 and the relevant cProfile output is:

Details

In addition to the time spent in the ir.py functions themselves, these functions call `list.append` and `list.extend` a lot.

```
780304/123711    0.528    0.000    0.777    0.000 ir.py:318(_rec_list_vars)
    99202    0.021    0.000    0.659    0.000 ir.py:351(list_vars)
    68576    0.021    0.000    0.387    0.000 ir.py:608(list_vars)

   783706    0.073    0.000    0.073    0.000 {method 'append' of 'list' objects}
   628873    0.061    0.000    0.061    0.000 {method 'extend' of 'list' objects}
```
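For anyone who wants to filter a profile down to rows like these, the following is a rough sketch using only the standard library's `cProfile` and `pstats`. The `list_vars` and `workload` functions here are toy stand-ins (assumptions for illustration), not the Numba code or the script from the linked issue.

```python
import cProfile
import io
import pstats


def list_vars(attrs):
    # Toy stand-in for the real method: collect string values from a dict.
    return [v for v in attrs.values() if isinstance(v, str)]


def workload():
    # Toy stand-in for a compilation run that calls list_vars many times.
    attrs = {"target": "a", "index": "i", "value": "v", "loc": None}
    for _ in range(1000):
        list_vars(attrs)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by cumulative time and keep only rows whose name matches "list_vars".
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats("list_vars")
report = stream.getvalue()
print(report)
```

The string passed to `print_stats` is treated as a regular expression matched against the printed function name, so the same call with `"list_vars|append|extend"` would also keep the list method rows.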

srilman avatar Aug 27 '24 20:08 srilman