
Benchmarking infrastructure ideas

Open · ValarDragon opened this issue 3 years ago · 3 comments

This issue is intended for discussion of what kinds of infrastructure we should add for benchmarking the constraint-generation framework's performance. I have some initial ideas below:

  • We add a new "constraints-benchmark" crate somewhere, which will include constraint generators for some sizable circuits of interest.
  • We augment the reporting tools on the constraint system to report the number of symbolic LCs and their average density. If possible, it would be nice to display these broken down by namespace. (A rough sketch of such a helper follows this list.)
  • We make scripts for generating the benchmark binaries and profiling them with tools like perf, massif, etc. I know we can convert perf's output to flamegraphs, and massif has massif-visualizer. I'm not sure whether we can make any of the above output Graphviz SVGs like pprof does in Go. (I've seen some blog posts claiming to do this with perf, but it didn't work for me the last time I tried it.)
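
A rough sketch of what that reporting could look like, assuming the ark-relations `ConstraintSystemRef` / `ConstraintMatrices` API (`num_constraints`, `to_matrices`, and the `*_num_non_zero` fields); the symbolic-LC count would need access to the constraint system's internal LC map, and the per-namespace breakdown is omitted here:

```rust
// Sketch only: synthesize a circuit into a fresh constraint system and print
// basic size and density statistics. Assumes the ark-relations r1cs API.
use ark_ff::Field;
use ark_relations::r1cs::{ConstraintSynthesizer, ConstraintSystem, ConstraintSystemRef};

pub fn report_constraint_stats<F: Field, C: ConstraintSynthesizer<F>>(circuit: C) {
    let cs: ConstraintSystemRef<F> = ConstraintSystem::<F>::new_ref();
    circuit
        .generate_constraints(cs.clone())
        .expect("constraint generation failed");
    cs.finalize();

    let num_constraints = cs.num_constraints();
    println!("constraints:        {}", num_constraints);
    println!("instance variables: {}", cs.num_instance_variables());
    println!("witness variables:  {}", cs.num_witness_variables());

    // Average density = non-zero entries per constraint, for each R1CS matrix.
    if let Some(matrices) = cs.to_matrices() {
        let density = |nnz: usize| nnz as f64 / num_constraints.max(1) as f64;
        println!("avg density (A):    {:.2}", density(matrices.a_num_non_zero));
        println!("avg density (B):    {:.2}", density(matrices.b_num_non_zero));
        println!("avg density (C):    {:.2}", density(matrices.c_num_non_zero));
    }
}
```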

Some remaining questions are

  • Can we keep track of constraint count / constraint density regressions in an automated way?
  • What circuits should we be profiling? The recursive circuits immediately come to mind, since that's what many of us are working on.
  • Where should the constraints-benchmark crate live? I currently think it ought to be its own repo.

Thoughts on any of the above / alternate ideas?


For Admin Use

  • [ ] Not duplicate issue
  • [ ] Appropriate labels applied
  • [ ] Appropriate contributors tagged
  • [ ] Contributor assigned/self-assigned

ValarDragon · Jan 05 '21 16:01

We augment the reporting tools on the constraint system to report the number of symbolic LCs and their average density. If possible, it would be nice to display these broken down by namespace.

Agreed, these would be quite useful. I think having a general tool to profile constraints would be nice too.

We make scripts for generating the benchmark binaries and profiling them with tools like perf, massif, etc. I know we can convert perf's output to flamegraphs, and massif has massif-visualizer.

This probably belongs in its own repo, as these tools would be helpful for benchmarking other components too.

I'm not sure whether we can make any of the above output Graphviz SVGs like pprof does in Go. (I've seen some blog posts claiming to do this with perf, but it didn't work for me the last time I tried it.)

Does this work?

Can we keep track of constraint count / constraint density regressions in an automated way?

Probably something like this should help?

What circuits should we be profiling? The recursive circuits immediately come to mind, since that's what many of us are working on.

I think we should have a benchmark suite of useful constraint systems, including recursive constraints, hash functions, group/field operations, etc.
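
As a sketch, one entry in such a suite could be a criterion benchmark along these lines; `PlaceholderHashCircuit` is a hypothetical stand-in for the real hash/group/recursion gadgets:

```rust
// Sketch only: benchmark constraint synthesis for one circuit with criterion.
use ark_bls12_381::Fr;
use ark_relations::r1cs::{
    ConstraintSynthesizer, ConstraintSystem, ConstraintSystemRef, SynthesisError,
};
use criterion::{criterion_group, criterion_main, Criterion};

// Hypothetical placeholder; a real suite would use the actual gadgets.
#[derive(Default)]
struct PlaceholderHashCircuit;

impl ConstraintSynthesizer<Fr> for PlaceholderHashCircuit {
    fn generate_constraints(self, _cs: ConstraintSystemRef<Fr>) -> Result<(), SynthesisError> {
        // A real implementation would allocate inputs and enforce the hash gadget here.
        Ok(())
    }
}

fn bench_hash_circuit_synthesis(c: &mut Criterion) {
    c.bench_function("synthesize_hash_circuit", |b| {
        b.iter(|| {
            let cs = ConstraintSystem::<Fr>::new_ref();
            PlaceholderHashCircuit::default()
                .generate_constraints(cs.clone())
                .unwrap();
            cs.num_constraints()
        })
    });
}

criterion_group!(benches, bench_hash_circuit_synthesis);
criterion_main!(benches);
```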

Pratyush · Jan 05 '21 19:01

I've used that flame graph package and gotten it to work before. In Go, though, there's a tool called pprof which is excellent and which I vastly prefer. For example, here is the kind of graph it can output: https://user-images.githubusercontent.com/1709417/55467675-155f4800-5602-11e9-9c69-5b5f98a443a8.png. It tells you what percentage of the time is spent executing a given function and what percentage is spent executing the functions it calls. (With some extra flags you can also get it to show the line-by-line assembly.) The flame graph makes getting a sense of that much harder, IMO.

Probably something like this should help?

It'd be nice if we could get the expected result from another source, or maybe from a file in the repo that we have a bot maintain.

ValarDragon · Jan 05 '21 19:01

I've used that flame graph package and gotten it to work before. In Go, though, there's a tool called pprof which is excellent and which I vastly prefer. For example, here is the kind of graph it can output: user-images.githubusercontent.com/1709417/55467675-155f4800-5602-11e9-9c69-5b5f98a443a8.png. It tells you what percentage of the time is spent executing a given function and what percentage is spent executing the functions it calls. (With some extra flags you can also get it to show the line-by-line assembly.) The flame graph makes getting a sense of that much harder, IMO.

I agree that visualization is much better; maybe this tool could help? https://pingcap.medium.com/quickly-find-rust-program-bottlenecks-online-using-a-go-tool-37be0546aa8f
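
If that approach works for us, a minimal sketch using the pprof-rs crate described there (assuming its `ProfilerGuard` / `Report::flamegraph` API; `generate_benchmark_constraints` is a hypothetical workload) would be something like:

```rust
// Sketch only: in-process sampling profiler that writes an SVG flame graph.
// Requires the pprof crate with its `flamegraph` feature enabled.
use std::fs::File;

fn main() {
    // Sample the call stack ~100 times per second while the guard is alive.
    let guard = pprof::ProfilerGuard::new(100).unwrap();

    // The workload to profile, e.g. synthesizing a large circuit.
    generate_benchmark_constraints();

    // Build the report and render it; per the article, the report can also be
    // exported in a format consumable by Go's pprof tooling.
    if let Ok(report) = guard.report().build() {
        let file = File::create("flamegraph.svg").unwrap();
        report.flamegraph(file).unwrap();
    }
}

// Hypothetical placeholder for the actual constraint-generation workload.
fn generate_benchmark_constraints() {}
```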

It'd be nice if we could get the expected result from another source, or maybe from a file in the repo that we have a bot maintain.

That library does have the expect_file macro, which allows reading the expected values from a file. The Rust compiler also has this kind of infrastructure for testing regressions in error messages; we could look there if we need more powerful tools than expect_file. There's also this more powerful library: https://insta.rs/ (it's really polished).
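
For example, a constraint-count regression test with `expect_file` could look roughly like this (`constraint_stats_summary` is a hypothetical helper that renders the counts as a string; the snapshot file is updated by rerunning the test with `UPDATE_EXPECT=1`):

```rust
// Sketch only: snapshot-test the constraint statistics against a file in the repo.
use expect_test::expect_file;

#[test]
fn recursive_circuit_constraint_counts() {
    let actual = constraint_stats_summary();
    expect_file!["./snapshots/recursive_circuit_stats.txt"].assert_eq(&actual);
}

// Hypothetical helper: a real implementation would synthesize the circuit and
// format num_constraints, non-zero counts, etc.
fn constraint_stats_summary() -> String {
    String::from("constraints: 0\n")
}
```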

Pratyush · Jan 05 '21 20:01