Use Profile-Guided Optimization (PGO) for the compiler itself

Open zamazan4ik opened this issue 9 months ago • 5 comments

Hi!

Since the compiler is being rewritten in Go, I suggest using Profile-Guided Optimization (PGO) to optimize the TypeScript compiler itself. Go has supported PGO since 1.20, so it's an available option for the project. PGO works especially well for compiler-like workloads; for example, see my PGO benchmarks for compilers (even though most of them are for non-Go projects, the results should be much the same for Go since it uses the same ideas).

I suggest the following plan:

  • Run PGO benchmarks for the compiler. This requires choosing a training workload; I think compiling any TypeScript project would be a good option for most use cases.
  • Provide scripts that simplify building the compiler with PGO.
  • Integrate a PGO step into the CI pipeline so that end users get a PGO-optimized version of the compiler.
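
The build-script step could look roughly like the sketch below. It assumes the main package lives at ./cmd/tsgo and that a CPU profile from a training run has been saved as default.pgo next to it; the output path, the PROFILE, OUT, and RUN variables, and the pgo_build_cmd helper are hypothetical names for illustration. The -pgo flag of go build is real, but the project may well prefer to drive this through its own build tooling instead.

```shell
#!/bin/sh
# Sketch only: choose a go build command depending on whether a PGO profile exists.
# PROFILE, OUT, RUN, and pgo_build_cmd are hypothetical names, not part of the project.
PROFILE="${PROFILE:-cmd/tsgo/default.pgo}"
OUT="${OUT:-built/pgo/tsgo}"

# Print the build command for the given profile path.
# Without a profile, fall back to an explicit non-PGO build.
pgo_build_cmd() {
    if [ -f "$1" ]; then
        echo "go build -pgo=$1 -o $OUT ./cmd/tsgo"
    else
        echo "go build -pgo=off -o $OUT ./cmd/tsgo"
    fi
}

cmd=$(pgo_build_cmd "$PROFILE")
echo "$cmd"

# Only execute the build when explicitly requested, e.g. RUN=1 sh build-pgo.sh
if [ "${RUN:-0}" = "1" ]; then
    $cmd
fi
```

Printing the command by default keeps the script safe to try on a machine without a profile; a CI job would collect the profile first and then invoke the script with RUN=1.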

I understand that the project is in its early stages and there are probably more important things to finish at the moment. If so, just treat this issue as a possible improvement for future versions. I believe that improving compiler performance is valuable enough for end users.

Thank you.

P.S. If you think that Discussions is a better place for such issues, feel free to move it there.

zamazan4ik Mar 18 '25 05:03

The most recent time I tried PGO in this repo, we actually got slower! Definitely need to retest and file a new issue if that's still true.

jakebailey Mar 18 '25 05:03

Wow, that sounds pretty bad! I would personally be interested in seeing such a training/bench suite. If the training workload is reasonably representative (for benchmarking purposes, it's fine to start with the same training and bench suites) and we can reproducibly catch the slowdown, it's worth reporting upstream to the Go compiler, IMHO.

zamazan4ik Mar 18 '25 05:03

$ rm -rf built
$ hereby build
$ mv built/local built/local-old
$ ./built/local-old/tsgo -p ~/work/vscode/src --pprofDir=.
$ mv *-cpuprofile.pb.gz ./cmd/tsgo/default.pgo
$ hereby build
$ hyperfine -w=1 './built/local-old/tsgo -p ~/work/vscode/src' './built/local/tsgo -p ~/work/vscode/src'
Benchmark 1: ./built/local-old/tsgo -p /home/jabaile/work/vscode/src
  Time (mean ± σ):      8.657 s ±  0.791 s    [User: 53.821 s, System: 7.241 s]
  Range (min … max):    8.200 s … 10.878 s    10 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: ./built/local/tsgo -p /home/jabaile/work/vscode/src
  Time (mean ± σ):      9.396 s ±  1.398 s    [User: 50.923 s, System: 7.156 s]
  Range (min … max):    8.137 s … 12.366 s    10 runs
 
Summary
  ./built/local-old/tsgo -p /home/jabaile/work/vscode/src ran
    1.09 ± 0.19 times faster than ./built/local/tsgo -p /home/jabaile/work/vscode/src

Not exactly scientific; there's so much noise.
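
As a rough sanity check on how much that noise matters, the sketch below recomputes the ratio of the two means together with a first-order propagated uncertainty from the reported standard deviations. The numbers are copied from the output above; the ratio_summary helper is a hypothetical name, and the assumption that this is how the summary line is derived is only an inference from the matching result.

```shell
# Means and standard deviations copied from the benchmark output above.
old_mean=8.657; old_sd=0.791   # build without the profile
new_mean=9.396; new_sd=1.398   # build with default.pgo

# Print "ratio +/- uncertainty" for b/a, combining the relative
# standard deviations in quadrature (first-order error propagation).
ratio_summary() {
    awk -v a="$1" -v sa="$2" -v b="$3" -v sb="$4" 'BEGIN {
        r = b / a
        err = r * sqrt((sa / a) ^ 2 + (sb / b) ^ 2)
        printf "%.2f +/- %.2f\n", r, err
    }'
}

ratio_summary "$old_mean" "$old_sd" "$new_mean" "$new_sd"
# prints: 1.09 +/- 0.19
```

The interval 1.09 +/- 0.19 spans roughly 0.90 to 1.28 and therefore includes 1.0, so these ten runs do not show that either build is faster.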

jakebailey Mar 19 '25 04:03

To reduce the noise, I additionally suggest:

  • Use CPU pinning (on Linux, the taskset -c command) to reduce CPU scheduler noise
  • Increase the number of warmups
  • Increase the number of test runs
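
Put together, these suggestions might translate into a command like the one the sketch below prints. The core list, warmup count, run count, and the bench_cmd helper are hypothetical values and names chosen for illustration; since tsgo appears to use many cores, pinning to a fixed set of cores is assumed here rather than a single one.

```shell
# Hypothetical defaults: adjust the core list, warmups, and run count per machine.
CORES="${CORES:-0-7}"
WARMUP="${WARMUP:-3}"
RUNS="${RUNS:-30}"
PROJECT="${PROJECT:-$HOME/work/vscode/src}"

# Print a hyperfine invocation that pins every benchmarked binary
# to the same set of cores with taskset.
bench_cmd() {
    printf "hyperfine --warmup %s --runs %s" "$WARMUP" "$RUNS"
    for bin in "$@"; do
        printf " 'taskset -c %s %s -p %s'" "$CORES" "$bin" "$PROJECT"
    done
    printf '\n'
}

bench_cmd ./built/local-old/tsgo ./built/local/tsgo
# To actually run the benchmark:
#   eval "$(bench_cmd ./built/local-old/tsgo ./built/local/tsgo)"
```

Printing the command first makes it easy to review the pinning and run counts before spending several minutes on the benchmark itself.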

Currently, it's hard to say whether PGO helps this project or not.

zamazan4ik Mar 19 '25 04:03

Yeah, I have to run this on my dedicated perf machine. It's just currently configured to segment off one physical core for tsc benchmarking, but now we have all of these cores, so I have to figure something else out...

jakebailey Mar 19 '25 04:03