Fix handling of unsigned arithmetic.
The existing implementation unfortunately falls prey to Java signed byte arithmetic in a couple of places. The hash of almost any odd-length buffer involving trailing 0xff bytes, for example, will not match the SipHash reference C implementation. This problem also makes it easy to generate collisions.
This pull request applies the typical Java unsigned-math workaround where necessary. The output now matches the reference implementation in my testing.
Apologies for the couple of accidental whitespace changes.
I haven't had time to add a test case myself, unfortunately, that specifically highlights this problem. Should be straightforward to write, though (see the note above regarding trailing 0xff bytes, for example).
That said: I'd suggest taking a look at this. The existing implementation works for simple test cases, but will often compute incorrect results on real data. (It may even fail to prevent hash flooding, but I haven't convinced myself of that.)
I found the same problem. Note: the length can not be negative, so the change "((long) data.length) & 0xff" is not needed. It is unfortunate the that official SipHash test vectors doesn't include randomly generated data...
By the way, you may also want to fix "b[0] = (byte) (l & 0xff);" and so on (for those cases, the "& 0xff" is not needed).
Good point. Reverted the length change and some whitespace noise. Leaving the b[0] code alone for this PR.