
Poor unit test performance using py_test rule

Open cwaeland opened this issue 5 years ago • 12 comments

I'm in the process of switching our rather large Python codebase over to using Bazel and have just finished getting our entire unit test suite to run using Bazel. Our tests are written using unittest and we leveraged pytest as our test runner. Typically a full run of the test suite (~6000 tests) would take around 10 minutes. However, with Bazel I'm now observing run times of about 40 minutes. Of course, once things are cached it is super fast.
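
For context, wiring pytest in as the test runner for a py_test target is typically done with a thin entry-point script along the lines of the sketch below (illustrative only; the file name and exact wiring are assumptions rather than our actual setup):

# pytest_main.py -- hypothetical entry point used as the main of a py_test
# target, so Bazel invokes pytest once per test target.
import sys

import pytest

if __name__ == "__main__":
    # Forward any command-line arguments (e.g. the test files for this
    # target) to pytest and propagate its exit code back to Bazel.
    sys.exit(pytest.main(sys.argv[1:]))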

I knew to expect some overhead but I haven't seen anything like this with other Bazel projects I've worked on.

I went ahead and collected some timing/profile information, using the analyze-profile option that Bazel has, for subsets of our unit tests. bazel clean was run before each timing run.

First, our web package: ./bazel test tests/python/web/.... This consists of 70 test targets. Profile information:

=== PHASE SUMMARY INFORMATION ===

Total launch phase time         6.702 s    2.35%
Total init phase time           0.267 s    0.09%
Total loading phase time        1.632 s    0.57%
Total analysis phase time       5.197 s    1.82%
Total preparation phase time    0.032 s    0.01%
Total execution phase time    271.452 s   95.15%
Total finish phase time         0.017 s    0.01%
------------------------------------------------
Total run time                285.301 s  100.00%

Critical path (165.526 s):
       Time Percentage   Description
     392 ms    0.24%   action 'Creating source manifest for //tests/python/web/controllers/api listing_check_ins_test'
   16.616 s   10.04%   action 'Creating runfiles tree bazel-out/darwin-py2-fastbuild/bin/tests/python/web/controllers/api/listing_check_ins_test.runfiles'
    12.5 ms    0.01%   runfiles for //tests/python/web/controllers/api listing_check_ins_test
  148.505 s   89.72%   action 'Testing //tests/python/web/controllers/api listing_check_ins_test'

Since this listing_check_ins_test (which is indeed a slow test) seems to be on the critical path, I tried running it on its own:

=== PHASE SUMMARY INFORMATION ===

Total launch phase time         0.060 s    0.12%
Total init phase time           0.139 s    0.28%
Total loading phase time        1.616 s    3.28%
Total analysis phase time       4.551 s    9.25%
Total preparation phase time    0.017 s    0.04%
Total execution phase time     42.820 s   87.00%
Total finish phase time         0.016 s    0.03%
------------------------------------------------
Total run time                 49.221 s  100.00%

Critical path (42.567 s):
       Time Percentage   Description
     154 ms    0.36%   action 'Creating source manifest for //tests/python/web/controllers/api listing_check_ins_test'
     920 ms    2.16%   action 'Creating runfiles tree bazel-out/darwin-py2-fastbuild/bin/tests/python/web/controllers/api/listing_check_ins_test.runfiles'
    11.6 ms    0.03%   runfiles for //tests/python/web/controllers/api listing_check_ins_test
   41.481 s   97.45%   action 'Testing //tests/python/web/controllers/api listing_check_ins_test'

It seems noteworthy that when run by itself it's 3x faster.

At this point I wanted to compare to pytest.

When testing the web package using pytest we see the following result: 216 passed in 96.19 seconds

When running the single slow test with pytest: 11 passed in 32.79 seconds.

Interestingly when running the executable that Bazel produces directly (./listing_check_ins_test) I observe a similar time to that of pytest.

Ultimately Bazel profiling seems to indicate:

  • The majority of the time is spent in the execution phase.
  • Running tests in parallel seems to negatively impact test speed quite significantly.
  • The executable that Bazel produces, when run on its own, has the same performance as pytest.

Any advice on how to continue to investigate this further would be appreciated.

cwaeland avatar May 14 '20 15:05 cwaeland

Hey @cwaeland, thanks for the detail you've put in here. Unfortunately, I don't have any immediate advice. I've experienced slowness in py_test relative to using pytest or nose outside of Bazel, but caching has kept developer efficiency from decreasing on our (smaller) Python test suite.

thundergolfer avatar May 23 '20 06:05 thundergolfer

I am using Bazel for a Python-only codebase that is fairly small, and I see long build times (not only tests; just building is slow). Right now it takes about 10 minutes to do a full build on CI, while there are only about 10-15 Python packages. For comparison, I am able to perform a full build on another repository with only Java code in about 4 minutes (that codebase is slightly larger). Again, I am only counting build time, not test time.

I don't have any precise figures, but I have the impression that it is related to the number and size of external dependencies. In particular, at some point, building the runfiles tree was taking forever. I would be very interested in understanding the performance bottlenecks in Python better…

pvcnt avatar Aug 28 '20 13:08 pvcnt

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

github-actions[bot] avatar Apr 14 '21 22:04 github-actions[bot]

Hi there, we are using Python 3.8 with Bazel, and we have the same experience that py_test is really slow: compared with a native Python run it is at least 3x slower. Any ideas or suggestions for this?

tjulinfan avatar Sep 15 '22 09:09 tjulinfan

@thundergolfer any updates?

ramilmsh avatar Jul 11 '23 10:07 ramilmsh

As mentioned in the OP, this is probably a result of running tests in parallel. Some test suites, e.g. for Django, will create a database during setup. When you only invoke pytest once for the entire test suite, that cost is only paid once. If, like Bazel, you invoke pytest once per test, then all the setup, like creating a database, is multiplied by the number of tests. Could that explain the slowdown you're seeing?
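
To make the setup-cost argument concrete, here is a minimal sketch (the file and table names are made up; the sqlite connection stands in for something expensive like a Django test database) of a session-scoped fixture. "Session" scope means once per pytest invocation, so the setup runs once for the whole suite under a single pytest run, but once per py_test target under Bazel:

# conftest.py -- illustrative only.
import sqlite3

import pytest

@pytest.fixture(scope="session")
def test_database():
    # Expensive setup, paid once per pytest invocation.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE check_ins (id INTEGER PRIMARY KEY, listing TEXT)")
    yield conn
    # Teardown also runs once per pytest invocation.
    conn.close()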

RJPercival avatar Oct 06 '23 07:10 RJPercival

@RJPercival not in my case, because I run all of a service's tests as one bazel test target (something I haven't gotten around to fixing yet), and all the services are started in my CI pipeline before the build/test stage entirely (something I don't even know how to fix, because I haven't figured out how to run integration tests in Bazel).

ramilmsh avatar Oct 06 '23 07:10 ramilmsh

I wonder if #1653 is related? Shall we close this issue in favour of the newer one?

aignas avatar Jan 22 '24 08:01 aignas

@aignas I am not using a hermetic toolchain and I appear to experience the issue anyway. I can benchmark to know for sure, if that helps.

ramilmsh avatar Jan 22 '24 09:01 ramilmsh

@aignas the poor performance occurs even without using a hermetic toolchain. However, a hermetic toolchain would worsen it.

cwaeland avatar Jan 22 '24 16:01 cwaeland

I think the thing to do is rule out "lots of files in runfiles" as the culprit (or otherwise isolate that from performance analysis). It's easy to end up needing thousands of files after adding a few third party dependencies, which basically re-creates the situation with the hermetic runtimes -- lots of files need to be materialized in the runfiles.
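
As a rough first check, something like the following can count how many files end up materialized under a test's runfiles tree (a sketch; the script name is made up, and the example path, taken from the profile output above, needs to be adjusted to the target in question):

# count_runfiles.py -- count files under a .runfiles directory, e.g.
#   python3 count_runfiles.py bazel-out/darwin-py2-fastbuild/bin/tests/python/web/controllers/api/listing_check_ins_test.runfiles
import os
import sys

def count_files(root):
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total

if __name__ == "__main__":
    print(f"{count_files(sys.argv[1])} files under {sys.argv[1]}")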

There are a couple of known issues in Bazel here: (1) this is a lot of duplicated work (the runtime and external deps are the same files for each target), and (2) Macs in particular are hit harder by the cost of creating thousands of runfiles, for some Mac-specific reason.

rickeylev avatar Jan 22 '24 20:01 rickeylev