1brc
1brc copied to clipboard
SIMD parsing newlines, integer parsing, custom hashtable with SIMD lookup table for equality
Thanks for this amazing competition!
Here's my implementation. I must have sunk 100+ hours into background benchmarking of this. I learned a lot! I purposely didn't look at any other submissions so I've no idea how similar they are. Looking forward to learning from the crazy fast submissions from others.
Check List:
- [x] You have run
./mvnw verify
and the project builds successfully - [x] Tests pass (
./test.sh <username>
shows no differences between expected and actual outputs) - [x] All formatting changes by the build are committed
- [x] Your launch script is named
calculate_average_<username>.sh
(make sure to match casing of your GH user name) and is executable - [x] Output matches that of
calculate_average_baseline.sh
- [x] For new entries, or after substantial changes: When implementing custom hash structures, please point to where you deal with hash collisions (line number) -> line
480
(slotIndex = (slotIndex + 1) % NUM_SLOTS;
)
- Execution time:
3.974s
i9 7900X, 10 Cores, 20 HT, 64GB RAM. Expecting it to take~5s
on the Hetzner AX161. - Execution time of reference implementation:
3m28.684s
i9 7900X, 10 Cores, 20 HT, 64GB RAM
Please run the formatter and amend the PR with the changes.
This fails on the 10K keyset test (see create_measurements_3.sh) with 1B rows. Seeing these:
Exception in thread "Thread-8" java.lang.IndexOutOfBoundsException: Index 262139 out of bounds for length 262137
at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
at java.base/java.util.Objects.checkIndex(Objects.java:385)
at jdk.incubator.vector/jdk.incubator.vector.VectorIntrinsics.checkFromIndexSize(VectorIntrinsics.java:61)
at jdk.incubator.vector/jdk.incubator.vector.ByteVector.fromArray(ByteVector.java:2975)
at dev.morling.onebrc.CalculateAverage_chrisbellew$ThreadProcessor.processBuffer(CalculateAverage_chrisbellew.java:339)
at dev.morling.onebrc.CalculateAverage_chrisbellew$ThreadProcessor.processRange(CalculateAverage_chrisbellew.java:303)
at dev.morling.onebrc.CalculateAverage_chrisbellew$ThreadProcessor.run(CalculateAverage_chrisbellew.java:279)
Fixed the issue, thanks @gunnarmorling. :)
This produces incorrect results for the 10K keyset test (see create_measurements_3.sh). Note that we're after the cut-off time, you'll have two more changes you can make to this PR (see note at the top of the README). If it's not working or valid then, I'll have to close it unfortunately. Updates should be pushed quickly, so I can evaluate all the pending entries. Thx!
+ timeout -v 300 ./test.sh ChrisBellew measurements_10K_1B.txt
Validating calculate_average_ChrisBellew.sh -- measurements_10K_1B.txt
WARNING: Using incubator modules: jdk.incubator.vector
48c48
< -so;-15.7;15.1;45.6
---
> -so;-15.7;14.8;45.6
50c50
< -su;-25.0;7.0;46.0
---
> -su;-25.0;5.6;35.0
94c94
< Alot;-13.9;16.6;48.5
---
> Alot;-12.1;17.1;48.5
...
Thanks for your patience @gunnarmorling. I wasn't making good enough use of the test scripts with the different files.
I located the issue and fixed it. We should be good to go now!
Looking good now: 00:06.982.