Benchmark GeoTIFF read speeds
Measuring how long it takes to read GeoTIFF files (in terms of duration and throughput), to establish a performance baseline as we work towards addressing #5.
- Using three 50MB GeoTIFF files from https://github.com/kokoalberti/geotiff-benchmark of dtype u8, i16 and f32
- Set up CI to run the criterion.rs benchmarks that will be tracked on Codspeed over time
- Note that the benchmarks will only run on PRs labelled run/benchmark, on the main branch, or on release.
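For anyone unfamiliar with what the benchmarks measure, here is a minimal standard-library sketch of the idea: call a read routine repeatedly and average the wall-clock time per call. This is not the benchmark code in this PR. The name `mean_read_time` and the convention that the closure returns a byte count are assumptions made for illustration, and criterion.rs also does warm-up, sampling and outlier analysis, which this sketch leaves out.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of this PR): runs `read` `iters` times and
/// returns the mean wall-clock time per call, plus the byte count reported
/// by the last call. `read` is expected to return how many bytes it read.
fn mean_read_time<F: FnMut() -> usize>(iters: u32, mut read: F) -> (Duration, usize) {
    assert!(iters > 0, "need at least one iteration");
    let mut bytes = 0;
    let start = Instant::now();
    for _ in 0..iters {
        // black_box discourages the optimizer from discarding the call.
        bytes = black_box(read());
    }
    // Duration implements Div<u32>, giving the mean time per iteration.
    (start.elapsed() / iters, bytes)
}
```

Passing a closure that loads one of the downloaded files with `std::fs::read` and returns its length would time raw file I/O only. The actual benchmarks time the GeoTIFF read path instead.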
To run locally:
# Download and extract GeoTIFF files
wget https://s3.us-east-2.amazonaws.com/geotiff-benchmark-sample-files/geotiff_sample_files.tar.gz -P resources
tar --extract --verbose --file resources/geotiff_sample_files.tar.gz
# Run benchmarks
cargo bench
Example output on my laptop:
Running benches/read_geotiff.rs (target/release/deps/read_geotiff-adb36ae75c76df4d)
Gnuplot not found, using plotters backend
read_geotiff_50MB/u8_dtype
time: [15.652 ms 15.730 ms 15.822 ms]
thrpt: [3.1601 GB/s 3.1786 GB/s 3.1945 GB/s]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
read_geotiff_50MB/i16_dtype
time: [15.104 ms 15.154 ms 15.207 ms]
thrpt: [3.2879 GB/s 3.2995 GB/s 3.3103 GB/s]
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
read_geotiff_50MB/f32_dtype
time: [1.4039 ms 1.4068 ms 1.4099 ms]
thrpt: [35.463 GB/s 35.541 GB/s 35.616 GB/s]
Found 18 outliers among 100 measurements (18.00%)
4 (4.00%) low severe
9 (9.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
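As a sanity check on the output above, the time and throughput estimates are related by throughput = bytes / time. The sketch below assumes the benchmark declares a nominal file size of 50,000,000 bytes and that GB means 10^9 bytes. Both are inferred from the reported numbers rather than stated anywhere in this PR, and the helper name `throughput_gb_per_s` is made up for illustration.

```rust
/// Hypothetical helper: throughput in decimal GB/s (1 GB = 1e9 bytes)
/// for `bytes` read in `millis` milliseconds.
fn throughput_gb_per_s(bytes: u64, millis: f64) -> f64 {
    let seconds = millis / 1e3;
    bytes as f64 / seconds / 1e9
}

/// Assumed nominal size of each sample file (inferred, see lead-in).
const ASSUMED_FILE_BYTES: u64 = 50_000_000;
```

Under those assumptions, the median times of 15.730 ms, 15.154 ms and 1.4068 ms reproduce the reported median throughputs of about 3.18, 3.30 and 35.5 GB/s respectively.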
References:
- https://kokoalberti.com/articles/geotiff-compression-optimization-guide/
- https://docs.codspeed.io/benchmarks/rust
Ok, so the Codspeed Action (for tracking benchmarks over time) requires someone with admin permissions to install the CodspeedHQ GitHub App (see docs at https://docs.codspeed.io/importing-repositories). Have requested the install, and will re-run the benchmarks CI once that is enabled.
@georust/core (sorry for the wide ping, couldn't figure out who are admins...), just checking if it's possible to install the Codspeed GitHub App so as to track benchmarks over time in georust/geotiff here. I've requested an installation through https://github.com/apps/codspeed-hq/installations/select_target already following instructions at https://docs.codspeed.io/integrations/providers/github, but unsure who gets the email to approve the installation.
@weiji14 I approved Codspeed for this repository. Let me know if that worked
CodSpeed Performance Report
Congrats! CodSpeed is installed 🎉
🆕 0 new benchmarks were detected.
You will start to see performance impacts in the reports once the benchmarks are run from your default branch.
Detected benchmarks
Thanks @frewsxcv! Looks like it is working now, the benchmark page is up at https://codspeed.io/georust/geotiff :tada:
FYI codspeed just asked for some more permissions that I'd prefer not to enable org wide:
Do you know if there's a different solution?
It also seems like there's no data in codspeed:
Perhaps because it's been so long since it's been run? I suppose a free service can't offer to keep our perf data around forever.
For my own perf measurements, I'm just doing local before/after benchmarks with criterion. Sometimes I'll record the output in git for posterity. It's admittedly lacking flashy dashboards, but it's simple to maintain.
Since it doesn't seem like this is being very actively used in practice, I'm inclined to disable the extension. What do you think @weiji14?
> It also seems like there's no data in codspeed:
> Perhaps because it's been so long since it's been run? I suppose a free service can't offer to keep our perf data around forever.
There's no data because this PR isn't merged into the main branch yet. I'm not sure how long they keep the perf data, but another project I'm maintaining with Codspeed enabled has benchmarks going back a year (to when the GitHub Action was set up).
> Since it doesn't seem like this is being very actively used in practice, I'm inclined to disable the extension. What do you think @weiji14?
According to https://docs.codspeed.io/integrations/providers/github#permissions and https://discord.com/channels/1065233827569598464/1333477558406090905/1333478376048037979, the "Administration (Read/Write)" permissions are required to register CodSpeed Macro runners on repositories. We're not using those self-hosted runners, so those permissions won't be needed, and I think it is ok to ignore them (the app should still work with the old permissions).
That said, I'm also ok with disabling the extension if you think it is a privacy concern, and I'll refactor this PR to just have the benchmark scripts without the GitHub Action workflow. Either way, I won't merge this PR until another maintainer approves.
Got an email yesterday that Codspeed is dropping the "Repository: admin" permission :tada: You can see their updated permissions on https://docs.codspeed.io/integrations/providers/github#permissions
In that case, we can probably go ahead with this PR as is?
On the downside, it's requesting org wide permission to modify actions, which I don't think is tenable. Or am I misunderstanding?
The two organization permissions I see here are:
The permission seems to apply to self-hosted runners, not the standard GitHub-hosted runners. See also https://docs.github.com/en/rest/authentication/permissions-required-for-fine-grained-personal-access-tokens?apiVersion=2022-11-28#organization-permissions-for-self-hosted-runners. Are there self-hosted runners being used elsewhere in the georust org?
Note that we are not using Codspeed's macro runners here in CI (which would require setting runs-on: codspeed-macro instead of runs-on: ubuntu-latest), so no self-hosted runners are being used.
Ah thank you for clarifying that it's only org-wide self-hosted runners.
> Are there self-hosted runners being used elsewhere in the georust org?
I've just confirmed (I think?) that no one in the organization is currently using self-hosted runners.
$ gh api /orgs/georust/actions/runners
{
"total_count": 0,
"runners": []
}