
Refactor serialization of benchmarks

Open syl20bnr opened this issue 1 year ago • 6 comments

See the Changes section below for the full list of changes.

I will add an "upload" function to the persistence module.

~~The current naming of the saved files needs a bit of work: currently I just do benchmarks_<timestamp>.json, but we could have a race condition and get some files overwritten.~~ I changed the format, see this comment: https://github.com/tracel-ai/burn/pull/1131#issuecomment-1887327547

The computed values are saved in seconds; maybe we should save them in milliseconds?

You can see an example of produced file here: https://gist.github.com/syl20bnr/b0fdbbf0c3877b9e4ddf262a9cc388ac

Checklist

  • [x] Confirmed that the `run-checks all` script has been executed.
  • [x] Made sure the book is up to date with changes in this PR.

Related Issues/PRs

  • https://github.com/tracel-ai/burn/issues/1072

Changes

  • flatten the benchmark data to make it easier to save documents to a database and query them
  • split some information into dedicated fields, such as backend and device
  • add new serialized info (see the sketch after this list):
    • computed values (mean, median, variance, min, max)
    • number of samples
    • operation name
    • tensor shapes, if any
  • serialize to separate files, one file per benchmark run
  • simplify the persistence module to expose only a save method
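
For illustration, a rough sketch of what one flattened record could look like with serde; the field names and example values here are illustrative, not the exact schema of this PR (the linked gist shows the real output):

```rust
use serde::Serialize;

/// Hypothetical flattened benchmark record, one per benchmark run.
#[derive(Serialize)]
struct BenchmarkRecord {
    backend: String,          // e.g. "wgpu"
    device: String,           // device identifier
    operation: String,        // operation name, e.g. "gelu"
    shapes: Vec<Vec<usize>>,  // tensor shapes, if any
    num_samples: usize,
    mean: f64,
    median: f64,
    variance: f64,
    min: f64,
    max: f64,
    timestamp: u128,          // when the run was saved
}

// Writing it out is then a plain `serde_json::to_string_pretty(&record)`.
```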

Testing

Tested serialization to disk with all the available benchmarks.

syl20bnr · Jan 11 '24 03:01

For the .expect() messages, I am not sure what to put in them. I see a lot of people putting an explanation of the error there, and sometimes people put what is expected to work instead. The word "expect" suggests the latter, but that's not what I see online most of the time 🤷🏻‍♂️ What's our code guideline on this?

syl20bnr · Jan 11 '24 03:01

Codecov Report

Attention: 80 lines in your changes are missing coverage. Please review.

Comparison is base (f43b686) 85.65% compared to head (94a214b) 85.67%. Report is 3 commits behind head on main.

| Files | Patch % | Lines |
| --- | --- | --- |
| backend-comparison/src/persistence/base.rs | 0.00% | 51 Missing :warning: |
| burn-common/src/benchmark.rs | 73.33% | 28 Missing :warning: |
| burn-compute/src/tune/tune_benchmark.rs | 0.00% | 1 Missing :warning: |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1131      +/-   ##
==========================================
+ Coverage   85.65%   85.67%   +0.02%     
==========================================
  Files         513      513              
  Lines       56816    56987     +171     
==========================================
+ Hits        48665    48823     +158     
- Misses       8151     8164      +13     


codecov[bot] · Jan 11 '24 03:01

I updated the file name to include the bench name and the first 8 characters of a random UUID. Examples:

  • benchmarks_gelu_1b2bb391.json
  • benchmarks_matmul_f79409f7.json

The name is passed as an argument to the save function because each bench run within a benchmark can have a different name and operation. For instance, with gelu we have 3 batches of benches using different gelu implementations.
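
A minimal sketch of that naming scheme, assuming the `uuid` crate with the `v4` feature is available (the actual implementation may differ):

```rust
use uuid::Uuid;

/// Build a file name like `benchmarks_gelu_1b2bb391.json`.
fn benchmark_file_name(bench_name: &str) -> String {
    // 32 hex chars without hyphens; keep only the first 8.
    let id = Uuid::new_v4().simple().to_string();
    format!("benchmarks_{}_{}.json", bench_name, &id[..8])
}
```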

syl20bnr · Jan 11 '24 14:01

> For the .expect() messages, I am not sure what to put in them. I see a lot of people putting an explanation of the error there, and sometimes people put what is expected to work instead. The word "expect" suggests the latter, but that's not what I see online most of the time 🤷🏻‍♂️ What's our code guideline on this?

From here:

> If you’re having trouble remembering how to phrase expect error messages remember to focus on the word “should” as in “env variable should be set by blah” or “the given binary should be available and executable by the current user”.

So I just stick to the word "should" and it works out.
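
For illustration, a minimal example of that phrasing (the path and message below are made up):

```rust
use std::fs;

fn main() {
    // "should" phrasing: state the condition that is expected to hold.
    let config = fs::read_to_string("benchmarks/config.json")
        .expect("benchmark config should exist at benchmarks/config.json");
    println!("{config}");
}
```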

louisfd · Jan 11 '24 14:01

> It's very good in general. The only issue I see is how often we must repeat the name of the benchmark (see my comments, for example on unary). Could it not be unified in one place?

You are right. I thought about it, but I cannot fetch the binary name at runtime; the only references to it are in Cargo, on the command line, and in the source file name. I'll try to find something.
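
For reference, one avenue that could be explored is deriving the name from the running executable; this is only a sketch, with the caveat that cargo typically appends a hash suffix to bench binaries, so the result would still need cleaning:

```rust
use std::env;

/// Best-effort guess at the benchmark name from the running binary,
/// e.g. `target/release/deps/unary-1a2b3c4d` -> "unary-1a2b3c4d".
fn binary_name() -> Option<String> {
    let exe = env::current_exe().ok()?;
    exe.file_stem().map(|s| s.to_string_lossy().into_owned())
}
```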

syl20bnr · Jan 11 '24 15:01

Ready for another round of review. Updated gist: https://gist.github.com/syl20bnr/b0fdbbf0c3877b9e4ddf262a9cc388ac

Examples of file names on disk:

  bench_binary_1705007825641.json
  bench_from_data_1705007836909.json
  bench_gelu_reference_1705007872731.json
  bench_gelu_withcustomerf_1705007872987.json
  bench_gelu_withreferenceerf_1705007872929.json
  bench_matmul_1705007855627.json
  bench_to_data_1705007832378.json
  bench_unary_1705007818349.json
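
The trailing numbers appear to be UNIX timestamps in milliseconds; a rough sketch of such a naming scheme (illustrative only, the PR's code may differ):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Build a file name like `bench_matmul_1705007855627.json`.
fn bench_file_name(bench_name: &str) -> String {
    let millis = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system time should be after the UNIX epoch")
        .as_millis();
    format!("bench_{}_{}.json", bench_name, millis)
}
```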

I also added num_repeats to the documents; I believe it can be useful.

syl20bnr · Jan 11 '24 21:01

@nathanielsimard I already save the number of samples. See the gist.

I believe the repeat can still be useful; check the docstring I added for it. I think it is a good concept to have:

    /// Number of executions per sample

Say we have a very quick operation to bench; repeating it to get a meaningful sample can be useful.
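
As a sketch of the concept (illustrative only; num_repeats was ultimately removed, see below):

```rust
use std::time::{Duration, Instant};

/// One sample is the wall-clock time of `num_repeats` executions,
/// so very fast operations still produce a measurable duration.
fn measure_sample(num_repeats: usize, mut op: impl FnMut()) -> Duration {
    let start = Instant::now();
    for _ in 0..num_repeats {
        op();
    }
    start.elapsed()
}
```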

syl20bnr · Jan 12 '24 14:01

Ideally, the notion of repeat could be integrated into the main loop, but I tried it and it was not trivial to do.

syl20bnr · Jan 12 '24 14:01

We decided to remove num_repeats completely from the benchmarks.

syl20bnr · Jan 12 '24 15:01