
Use hyperfine and jq to improve evaluate.sh

Open hundredwatt opened this issue 1 year ago • 2 comments

From #105

For lack of a better name, I called this script evaluate2.sh

$ ./evaluate2.sh
Usage: evaluate2.sh <fork name> (<fork name 2> ...)

This script does the following:

  • Runs the same setup as evaluate.sh (Java version, Maven, measurements.txt)
  • Uses hyperfine for pretty progress tracking and results
  • Since hyperfine only displays the mean, uses jq to calculate the trimmed mean in accordance with this competition's rules

Example output (to keep this video short, it skips Maven and uses the 20-line measurements file 😄):

[Screenshot: image]

[GIF: 2024-01-06 12 38 35]

hundredwatt avatar Jan 06 '24 19:01 hundredwatt

I added output of the raw times in addition to the trimmed means, e.g.:

fork,trimmed_mean
spullara,0.52814592528
royvanrijn,0.49758428661333337

fork,raw_times
spullara,0.51465599528,0.53491328628,0.52979903628,0.51972545328,0.53921453728
royvanrijn,0.49466057828000004,0.49655945328,0.49698236928,0.4992110372800001,0.50892974428
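
The trimmed means above can be reproduced from the raw times: sort the five runs, drop the fastest and the slowest, and average the remaining three. The sketch below only illustrates that arithmetic. It uses sort and awk rather than jq, and the function name trimmed_mean is made up for this example, not taken from evaluate2.sh.

```shell
#!/bin/bash
# Illustration only: trimmed mean of a comma-separated list of run times.
# trimmed_mean is a hypothetical helper, not the code used in evaluate2.sh.
trimmed_mean() {
  # One value per line, sorted numerically (C locale so "." is the decimal point).
  printf '%s\n' "$1" | tr ',' '\n' | LC_ALL=C sort -n | LC_ALL=C awk '
    { t[NR] = $1 }
    END {
      if (NR < 3) { print "need at least 3 runs" > "/dev/stderr"; exit 1 }
      # Skip the first (fastest) and the last (slowest) value.
      for (i = 2; i < NR; i++) sum += t[i]
      printf "%.6f\n", sum / (NR - 2)
    }'
}

trimmed_mean 0.51465599528,0.53491328628,0.52979903628,0.51972545328,0.53921453728
trimmed_mean 0.49466057828000004,0.49655945328,0.49698236928,0.4992110372800001,0.50892974428
```

With the rows above this prints 0.528146 for spullara and 0.497584 for royvanrijn, which matches the reported trimmed means after rounding to six decimals.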

hundredwatt avatar Jan 06 '24 19:01 hundredwatt

Very cool! One problem I see is that this moves the time measurement from just the java call to the whole launch script, which penalizes any contender that calls sdk to set up a specific distro (for example, I see the same time for merykitty as before, but +300 ms for royvanrijn). We'd somehow have to extract this step.
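
To make the concern concrete, here is a small sketch of the difference between timing only the program and timing the whole launch script. setup_env and run_entry are hypothetical stand-ins (plain sleeps) for the sdk call and the java call, and the millisecond timer assumes GNU date with %N support.

```shell
#!/bin/bash
# Hypothetical stand-ins: setup_env mimics a slow sdk call, run_entry the java call.
setup_env() { sleep 0.3; }
run_entry() { sleep 0.05; }
launch()    { setup_env; run_entry; }   # a launch script that calls sdk itself

# Wall-clock milliseconds spent in a command (assumes GNU date, which supports %N).
elapsed_ms() {
  start_ns=$(date +%s%N)
  "$@" >/dev/null
  end_ns=$(date +%s%N)
  echo $(( (end_ns - start_ns) / 1000000 ))
}

inner=$(elapsed_ms run_entry)   # setup excluded from the measurement
outer=$(elapsed_ms launch)      # setup included, as when timing the launch script
echo "entry only: ${inner} ms, whole launch script: ${outer} ms"
```

The second number includes the setup cost, which is the kind of extra time a contender pays when the environment setup sits inside the timed script.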

gunnarmorling avatar Jan 07 '24 10:01 gunnarmorling

New look, based on: https://github.com/gunnarmorling/1brc/issues/105#issuecomment-1881557239

[image]

hundredwatt avatar Jan 08 '24 23:01 hundredwatt

@gunnarmorling Yes, we can do hyperfine --output <FILE>. I'll take a look at process_output.java shortly

hundredwatt avatar Jan 09 '24 15:01 hundredwatt

I'll take a look at process_output.java shortly

Excellent, thx! It's invoked via ./process.sh (and sorry for the code, I know it's a hot mess... ;)

gunnarmorling avatar Jan 09 '24 16:01 gunnarmorling

Remaining TODO:

  • [ ] Save output to file
  • [ ] Incorporate functionality from process_output.java

hundredwatt avatar Jan 09 '24 16:01 hundredwatt

I'll take a look at process_output.java shortly

BTW, bash uses its builtin time, while GNU /usr/bin/time supports the TIME environment variable; see https://man7.org/linux/man-pages/man1/time.1.html

bash:

$ time sleep 1

real    0m1,009s
user    0m0,003s
sys     0m0,006s

/usr/bin/time has a quite verbose default format, which can be reduced to just the real (elapsed) time to avoid any kind of parsing.

$ /usr/bin/time sleep 1
0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 2240maxresident)k
0inputs+0outputs (0major+78minor)pagefaults 0swaps

$ export TIME="%e"
$ /usr/bin/time sleep 1
1.00

Note also that the default sh on Ubuntu is dash:

~$ ll /bin/sh
lrwxrwxrwx 1 root root 4 Sep 27  2018 /bin/sh -> dash*

dash does not have a time builtin, so all scripts that use #!/bin/sh and time (not /usr/bin/time) on Ubuntu end up running /usr/bin/time and print:

0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 2240maxresident)k
0inputs+0outputs (0major+78minor)pagefaults 0swaps

source sdk also does not work with dash.

I.e., for the next challenge the baseline should stick to #!/bin/bash and use /usr/bin/time; the output could then be configured via TIME :)
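
A minimal sketch of how a bash script could capture just the elapsed seconds with GNU time, assuming it is installed at /usr/bin/time (a separate package on some distros). It uses the -f and -o options instead of the TIME variable so that the report goes to a file rather than mixing with the program's own stderr. elapsed_seconds is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the elapsed wall-clock seconds of a command.
# Assumes GNU time at /usr/bin/time; the shell builtin `time` is not used.
elapsed_seconds() (
  report=$(mktemp)
  # -f "%e": elapsed real time in seconds; -o: write the report to a file.
  /usr/bin/time -f "%e" -o "$report" "$@" >/dev/null 2>&1
  cat "$report"
  rm -f "$report"
)

# Only run the demo where GNU time is actually available.
if /usr/bin/time -f "%e" -o /dev/null true >/dev/null 2>&1; then
  echo "sleep took $(elapsed_seconds sleep 0.2) s"
else
  echo "GNU time not found at /usr/bin/time" >&2
fi
```

Because the value is a plain number, it can be collected per run without parsing the real/user/sys lines that the bash builtin prints.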

AlexanderYastrebov avatar Jan 09 '24 16:01 AlexanderYastrebov

process_output.java is used as follows:

  1. Expected output is created via ./eval.sh baseline; cp baseline.out out_expected.txt
  2. Fork output is created via ./eval.sh $fork
  3. Invoke process.sh: ./process.sh $fork
  4. process_output.java then:
     a. Verifies the printed aggregation matches the baseline
     b. Collects the times
     c. Computes the trimmed mean
     d. Prints the leaderboard line in Markdown table format

evaluate2.sh already handles b. and c., so we just need to add a. and d.
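
Steps a. and d. could look roughly like the sketch below. This is not the logic from process_output.java: the function names, the files used in the demo, and the two-column row layout are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helpers sketching steps a. and d.; names and row layout are assumptions.

# a. Succeeds only if the fork's output is byte-identical to the expected output.
verify_output() {
  cmp -s "$1" "$2"
}

# d. Prints one Markdown table row: | fork | trimmed mean |
leaderboard_row() {
  printf '| %s | %s |\n' "$1" "$2"
}

# Demo with throwaway files standing in for out_expected.txt and a fork's output.
tmp=$(mktemp -d)
printf 'placeholder aggregation output\n' > "$tmp/expected.txt"
printf 'placeholder aggregation output\n' > "$tmp/fork.out"

if verify_output "$tmp/expected.txt" "$tmp/fork.out"; then
  leaderboard_row spullara 0.52814592528
else
  echo "verification failed" >&2
fi
rm -rf "$tmp"
```

Checking the output before printing the row means a fork with a wrong result never gets a leaderboard line.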

Let me know if I missed anything 😄

hundredwatt avatar Jan 09 '24 16:01 hundredwatt

Yepp, that sounds exactly right 👍 .

gunnarmorling avatar Jan 09 '24 16:01 gunnarmorling

Ugh, hyperfine --output <FILE> overwrites FILE on each run... it doesn't append.

For now we'll only check the output of 1 run unless we can find a workaround

hundredwatt avatar Jan 09 '24 17:01 hundredwatt

For now we'll only check the output of 1 run unless we can find a workaround

So if we run this script multiple times for multiple contenders, will we check the output of the last run for all contenders (ok)? Or just the last run of the last contender (bad)?

gunnarmorling avatar Jan 09 '24 17:01 gunnarmorling

I've created https://github.com/gunnarmorling/1brc/pull/266 as a follow-up to this one, re-organizing the existing launch scripts to adhere to the structure established here.

gunnarmorling avatar Jan 09 '24 17:01 gunnarmorling

So if we run this script multiple times for multiple contenders, will we check the output of the last run for all contenders (ok)? Or just the last run of the last contender (bad)?

The former, we're ok 👍

hundredwatt avatar Jan 09 '24 17:01 hundredwatt

I pushed all the changes from process_output.java!

New look:

Happy path: [image]

Verification failed: [image]

Going to do another round of testing on a Fedora box too

hundredwatt avatar Jan 09 '24 17:01 hundredwatt

For fun, the Leaderboard text now scans ./prepare_$fork.sh and extracts the Java version from sdk use if present 😄
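
Extracting the version could be as simple as the sketch below, which assumes the prepare script contains a line of the form sdk use java <version>. The helper name java_version_of and the demo file are hypothetical, and the real script may parse this differently.

```shell
#!/bin/bash
# Hypothetical helper: print the version from the first "sdk use java <version>"
# line of a prepare script, or nothing if the file or the line is missing.
java_version_of() (
  script=$1
  [ -f "$script" ] || exit 0
  sed -n 's/^[[:space:]]*sdk use java \([^[:space:]]*\).*/\1/p' "$script" | head -n 1
)

# Demo with a throwaway file standing in for ./prepare_$fork.sh.
demo=$(mktemp)
printf '#!/bin/bash\nsource "$HOME/.sdkman/bin/sdkman-init.sh"\nsdk use java 21.0.1-graal 1>&2\n' > "$demo"
java_version_of "$demo"
rm -f "$demo"
```

Returning an empty string when there is no sdk use line lets the caller fall back to whatever default Java version the leaderboard assumes.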

hundredwatt avatar Jan 09 '24 17:01 hundredwatt

Oops, forgot the SMT / turbo stuff

hundredwatt avatar Jan 09 '24 18:01 hundredwatt

That's awesome, really great stuff, @hundredwatt! I'm gonna squash everything into one commit and merge it. We can do any necessary fine-tuning in follow-up PRs. Thanks a lot for pulling through with this one!

gunnarmorling avatar Jan 09 '24 19:01 gunnarmorling

For fun, the Leaderboard text now scans ./prepare_$fork.sh and extracts the Java version from sdk use if present 😄

Wanted to suggest exactly that, but you beat me to it. Seems we can (re-)generate the leaderboard fully automatically with this change. Thanks again!

gunnarmorling avatar Jan 09 '24 19:01 gunnarmorling

My pleasure, happy to help out! Thanks for putting together such a fun, challenging and educational contest @gunnarmorling!

Hopefully I’ll find some time to submit my own entry once we run out of infra things to improve 😂

On Tue, Jan 9, 2024 at 12:52 PM Gunnar Morling @.***> wrote:

Merged #182 https://github.com/gunnarmorling/1brc/pull/182 into main.

hundredwatt avatar Jan 09 '24 20:01 hundredwatt