rules_python

feat(gazelle): pure golang helper

Open hunshcn opened this issue 1 year ago • 3 comments

Remove the Gazelle plugin's Python dependencies and make it hermetic, so it no longer relies on the system interpreter.

Use TreeSitter to parse Python code, and use https://github.com/pypi/stdlib-list to determine whether a module is in the standard library.
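For readers unfamiliar with the idea, here is a minimal Go sketch of the standard-library check only. It is not the code in this PR: the `stdlibModules` set and the `isStdlibImport` helper are hypothetical names, and the set is a tiny hand-picked sample, whereas the real helper would load the full per-version lists from stdlib-list. The TreeSitter parsing step is omitted, since it needs a third-party binding.

```go
package main

import (
	"fmt"
	"strings"
)

// stdlibModules is a tiny illustrative sample (assumption), standing in for
// the full module lists published by stdlib-list.
var stdlibModules = map[string]struct{}{
	"os": {}, "sys": {}, "json": {}, "collections": {}, "typing": {},
}

// isStdlibImport reports whether a dotted import path such as "os.path"
// belongs to the standard library, judged by its top-level package name.
func isStdlibImport(importPath string) bool {
	top, _, _ := strings.Cut(importPath, ".")
	_, ok := stdlibModules[top]
	return ok
}

func main() {
	for _, mod := range []string{"os.path", "json", "numpy.linalg"} {
		fmt.Printf("%-13s stdlib=%v\n", mod, isStdlibImport(mod))
	}
}
```

Matching on the top-level package keeps the lookup to a single map access per import; anything not in the set would be treated as a first-party or third-party dependency to resolve.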

Fixes #1825

hunshcn avatar May 13 '24 13:05 hunshcn

@dougthor42, @linzhp, could you please test this PR in your repos? Does this speed things up? Does it slow things down? Since you two have bigger repos, it would be nice to use them as tests to ensure there are no hidden edge cases.

aignas avatar May 15 '24 09:05 aignas

Very excited to see this! I will test it at Uber later this week if that's ok.

linzhp avatar May 15 '24 14:05 linzhp

Oh this looks really great! Let's see how it does...

/cc @ssmall, another Googler who will be using this.

Summary

This branch causes Gazelle to take ~35% of the time it took before, a savings of about 40 s for me. Hot damn that's awesome!

  • "Before this PR" commit: 55f31a306c8d69b46e50a1b2618c93081d4e11e8
  • "After this PR" commit: f6a190add150c12eeb862566de15df6f2f630444
Before this PR | After this PR | Delta  | After / Before
60.8 s         | 21.2 s        | 39.6 s | ~35%
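As a quick check of the arithmetic, the sketch below recomputes the time saved and the after/before ratio from the two hyperfine means reported further down (60.756 s and 21.190 s). The `summarize` helper is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// summarize returns the absolute time saved and the after/before ratio
// for two mean wall-clock times given in seconds.
func summarize(before, after float64) (saved, ratio float64) {
	return before - after, after / before
}

func main() {
	// Means taken from the hyperfine output in this comment.
	saved, ratio := summarize(60.756, 21.190)
	fmt.Printf("saved %.1f s, new run takes %.0f%% of the old time\n", saved, ratio*100)
}
```

The ~35% figure is the fraction of the old runtime that remains, so the reduction itself is roughly 65%.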

Background info:

  • 4731 Python files: find src -type f -name "*.py" | wc -l
  • 1618 directories: find src -type d -not -name "__pycache__" | wc -l
  • 499 BUILD.bazel files previously generated by Gazelle: find src -type f -name "BUILD.bazel" | wc -l
  • ~220 defined Python requirements in requirements.in
  • bazel version: 7.1.1
  • 24 core AMD EPYC vCPU
  • 96 GB RAM
  • SSD
  • Using experimental bazel downloader for pip.parse? yes.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux rodete
Release:        n/a
Codename:       rodete

Test:

  1. Run hyperfine --warmup 1 --runs 5 'bazel run //:gazelle'
    • only 5 runs instead of the default 10 because the old method takes >1min and I'm impatient.
    • warmup allows bazel downloader to cache my project's python dependencies
  2. Make sure that the git status of the test project is clean.

Before this PR:

Commit 55f31a306c8d69b46e50a1b2618c93081d4e11e8

$ hyperfine --warmup 1 --runs 5 'bazel run //:gazelle'
Benchmark 1: bazel run //:gazelle
  Time (mean ± σ):     60.756 s ±  0.867 s    [User: 53.651 s, System: 7.891 s]
  Range (min … max):   60.018 s … 61.873 s    5 runs

After this PR:

Commit f6a190add150c12eeb862566de15df6f2f630444

$ hyperfine --warmup 1 --runs 5 'bazel run //:gazelle'
Benchmark 1: bazel run //:gazelle
  Time (mean ± σ):     21.190 s ±  0.174 s    [User: 27.083 s, System: 2.679 s]
  Range (min … max):   20.964 s … 21.392 s    5 runs

dougthor42 avatar May 15 '24 23:05 dougthor42

This works at Uber.

linzhp avatar May 19 '24 21:05 linzhp