Improve numeric matching
What is your idea?
For Quamina, a couple of folks figured out how to represent the whole range of 64-bit float values in 64 big-endian bits, then encode them in base128, then discard certain suffixes. Check out numbits.go https://github.com/timbray/quamina/blob/main/numbits.go
So you get a smaller representation of numeric field values, and no more subsetting of the numbers that can be matched.
I can't see any reason this wouldn't work in Ruler.
Would you be willing to make the change?
I seem to recall that Rishi just wired in a similar flavor of change, so I suggest this one would be easy for him.
oh that's neat!
This should be doable within Ruler, though I won't be able to pick it up for a few weeks due to an internal launch.
Noting a link to the files I'd expect we'd need to touch to implement this: https://github.com/aws/event-ruler/commit/d04e3f03bd68d583cd7e01d9e174ad8ff949ac7b#diff-58bfacbbf2f6f6e26165ed131f0cae9667cf41d642d57b1a154ca97236507bce
To keep codebases roughly similar, I'll try to imitate numbits.go as much as Java lets me.
BTW I wrote a blog post about it at https://www.tbray.org/ongoing/When/202x/2024/08/28/Q-Numbers-2 and Arne Hoffman has promised to write an explanation of the bit-masking voodoo; I'll put a pointer in here when I see it. Off the top of my head, I don't think Java should get in the way, although I'm not sure there's an equivalent of Go's math.Float64bits(f). It’s a little weird because the byte values are between 0 and 127 inclusive, many of which are not printable characters even though they are valid UTF-8. In Quamina we do a little extra work to shorten the 10-byte results where possible, but I think that's going to screw up the horrible Range model-building logic; that shouldn't make it impossible, but the logic will have to be modified. If it were me I might decide to leave it at 10 bytes just to avoid that work.
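For anyone following along, here is a rough sketch of how the scheme described above might look in Java. It is only loosely modeled on numbits.go and is not taken from either codebase: the class and method names (NumbitsSketch, numbitsFromDouble, toBase128, trimTrailingZeros) are made up for illustration, and it assumes Double.doubleToRawLongBits is an acceptable stand-in for Go's math.Float64bits. The optional trimming step is kept separate so the fixed 10-byte form can be used on its own.

```java
import java.util.Arrays;

// Hypothetical names throughout; a sketch, not Ruler's or Quamina's actual code.
class NumbitsSketch {
    // 64 bits at 7 bits per byte needs ceil(64 / 7) = 10 bytes.
    static final int MAX_BYTES = 10;

    // Map a double to 64 bits whose unsigned order matches numeric order.
    static long numbitsFromDouble(double d) {
        long u = Double.doubleToRawLongBits(d);
        // Sign bit set (negative): flip every bit.
        // Sign bit clear (non-negative): flip only the sign bit.
        long mask = (u >> 63) | Long.MIN_VALUE;
        return u ^ mask;
    }

    // Split the 64 bits into ten 7-bit digits, most significant first.
    // Every resulting byte is in the range 0..127.
    static byte[] toBase128(long n) {
        byte[] b = new byte[MAX_BYTES];
        for (int i = MAX_BYTES - 1; i >= 0; i--) {
            b[i] = (byte) (n & 0x7f);
            n >>>= 7; // unsigned shift, so no sign bits leak in
        }
        return b;
    }

    // Optional shortening: drop trailing zero digits.
    static byte[] trimTrailingZeros(byte[] b) {
        int end = b.length;
        while (end > 0 && b[end - 1] == 0) {
            end--;
        }
        return Arrays.copyOf(b, end);
    }

    public static void main(String[] args) {
        for (double d : new double[] {-2.5, -1.0, 1.0, 2.5, 1e300}) {
            byte[] full = toBase128(numbitsFromDouble(d));
            System.out.println(d + " -> " + Arrays.toString(full)
                + ", trimmed length " + trimTrailingZeros(full).length);
        }
    }
}
```

Comparing the byte arrays lexicographically should then give the same ordering as comparing the original doubles. Skipping trimTrailingZeros keeps every value at exactly 10 bytes, which is the simpler option mentioned above.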
Thanks Tim. A cursory check points me to Double.doubleToLongBits and related functions. I haven't had the time to explore how these methods behave across various scenarios, but hopefully they're good enough for Ruler's needs.
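As a starting point for that exploration, here is a small sketch that prints the bit patterns for a few edge cases. The class and helper names (DoubleBitsSketch, hex, rawHex) are hypothetical. The documented difference between the two methods is that Double.doubleToLongBits collapses every NaN to the canonical value 0x7ff8000000000000, while Double.doubleToRawLongBits tries to preserve the NaN's actual bits. Whether that distinction matters for Ruler is an open question here, since NaN can't appear in JSON anyway.

```java
// Hypothetical helper class for poking at Java's double-to-bits methods.
class DoubleBitsSketch {
    // Bits via doubleToLongBits (NaNs are canonicalized), as 16 hex digits.
    static String hex(double d) {
        return String.format("%016x", Double.doubleToLongBits(d));
    }

    // Bits via doubleToRawLongBits (NaN payloads are kept where the platform allows).
    static String rawHex(double d) {
        return String.format("%016x", Double.doubleToRawLongBits(d));
    }

    public static void main(String[] args) {
        // A quiet NaN carrying a non-zero payload.
        double nanWithPayload = Double.longBitsToDouble(0x7ff8000000000001L);
        double[] samples = {0.0, -0.0, 1.0, -1.0, Double.MAX_VALUE,
                            Double.POSITIVE_INFINITY, Double.NaN, nanWithPayload};
        for (double d : samples) {
            System.out.println(d + "  longBits=" + hex(d) + "  rawLongBits=" + rawHex(d));
        }
    }
}
```

One thing worth noticing is that 0.0 and -0.0 have different bit patterns, so any encoding built directly on these bits would treat them as distinct values unless they are normalized first.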