native [ffigen] Benchmarks running FFIgen

We have performance concerns about ffigen when it runs on large ObjC APIs, as the bindings can be >100k lines. We should add a benchmark that runs ffigen on one of the Apple frameworks and measures:

Clang parse time
Binding generation time (without formatting)
Formatting time
Size of the generated code (in lines and bytes)
Max memory use

The benchmark should also run the analyzer over the generated code and measure:

Analysis time
Max memory use
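The measurements listed above could be collected with a small driver script. The following is a minimal Python sketch, not part of ffigen: it assumes each phase can be launched as its own command (which ffigen may not expose today, so the phases might need in-process instrumentation instead), and the names `run_phase`, `code_size`, and `PhaseResult` are hypothetical. It relies on `os.wait4`, which is Unix only, and `ru_maxrss` is reported in bytes on macOS but in kilobytes on Linux.

```python
import os
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class PhaseResult:
    name: str
    seconds: float
    max_rss: int  # platform-dependent unit: bytes on macOS, kilobytes on Linux
    exit_code: int


def run_phase(name, cmd):
    """Run one benchmark phase as a child process; record wall time and peak RSS."""
    start = time.perf_counter()
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
    # os.wait4 reports resource usage for this specific child (Unix only).
    _, status, usage = os.wait4(proc.pid, 0)
    elapsed = time.perf_counter() - start
    # Record the exit code on the Popen object so it does not try to wait again.
    proc.returncode = os.waitstatus_to_exitcode(status)
    return PhaseResult(name, elapsed, usage.ru_maxrss, proc.returncode)


def code_size(source):
    """Return (line count, UTF-8 byte count) for a string of generated code."""
    return len(source.splitlines()), len(source.encode("utf-8"))


if __name__ == "__main__":
    # Placeholder command: a real benchmark would launch a ffigen or analyzer
    # phase here instead of a no-op Python process.
    print(run_phase("noop", [sys.executable, "-c", "pass"]))
    print(code_size("void main() {}\n"))
```

Running each phase in its own process keeps the peak memory numbers separate per phase, at the cost of including process startup in the timings.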

This benchmark is easy to write. The part I'm not sure about is how to integrate it into GitHub CI. Is there a good framework for continuous benchmarking? Or should we just run the benchmark script locally as needed?

Aug 14 '24 04:08 liamappelbe

@HosseinYousefi jnigen should probably also have something like this.

Aug 14 '24 04:08 liamappelbe

> Is there a good framework for continuous benchmarking?

Maybe we can teach https://github.com/benchmark-action/github-action-benchmark to understand the output of https://pub.dev/packages/benchmark_harness.

github-action-benchmark can leave comments on PRs if the benchmark results exceed a threshold, which would be nice.
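One way to bridge the two tools without changing either is a small converter. The sketch below, in Python, assumes benchmark_harness prints lines of the form `Name(RunTime): 123.45 us.` and that github-action-benchmark is configured with its `customSmallerIsBetter` tool, which, as far as I know, reads a JSON array of objects with `name`, `unit`, and `value` fields. The function name `to_custom_benchmarks` is hypothetical.

```python
import json
import re

# Matches benchmark_harness-style lines such as
# "Template(RunTime): 0.1568472448997197 us."
_LINE = re.compile(
    r"^(?P<name>.+?)\((?P<metric>\w+)\):\s*"
    r"(?P<value>[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)\s*"
    r"(?P<unit>\w+)\.?\s*$"
)


def to_custom_benchmarks(output):
    """Convert benchmark output text into a list of name/unit/value dicts."""
    results = []
    for line in output.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            continue  # skip log lines that are not benchmark results
        results.append({
            "name": f"{match['name']} ({match['metric']})",
            "unit": match["unit"],
            "value": float(match["value"]),
        })
    return results


if __name__ == "__main__":
    sample = "Template(RunTime): 0.1568472448997197 us.\n"
    # The printed JSON is what would be written to the action's input file.
    print(json.dumps(to_custom_benchmarks(sample), indent=2))
```

The same script could concatenate the output of several benchmark files before converting, so that one JSON file covers the whole suite.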

However, benchmark_harness only helps with a single benchmark file. I'd say we probably also want something that finds all benchmark files and runs them in sequence: dart run benchmark_harness?

@mit-mit Are you aware of any benchmarking solutions (for use via GitHub actions) by our community?

@vaind I see you own package:benchmarking. Any suggestions from your side?

Aug 14 '24 06:08 dcharkes

> Any suggestions from your side?

Not really, except that I believe #718 and #435 would be more impactful to address first: they are about tracking the performance of the generated code that ends up in everyone's app, rather than the one-time codegen cost paid by the developer.

Aug 15 '24 07:08 vaind