Fix ArrayIndexOutOfBoundsException in primary key index
- [x] reproduce #399
- [ ] determine root cause and fix
The issue is not with the primary key index but with what is considered an absent value (referred to as a null value).
Even for values (not references), such as the value of a `long` field, an absence of the value is encoded, and a certain bit pattern has to represent that absence.
When writing out state, the maximum number of bits required to encode a field is computed and encoded; it is the maximum of the number of bits needed to encode each value. More specifically, for an integral value (such as a `long` value) it is the number of bits required to encode the zigzag transformation of the value (https://en.wikipedia.org/wiki/Variable-length_quantity#Zigzag_encoding).
In summary, the maximum number of bits of an integral value is the result of the expression `64 - Long.numberOfLeadingZeros(zigZag(value) + 1)`.
The absent integral value is chosen to be `(1L << maxBits) - 1`. When such a value is read, the minimum value is returned, e.g. for `long` it is `Long.MIN_VALUE`.
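To make the arithmetic above concrete, here is a minimal sketch of the two expressions. The class and method names (`NullBitPatternSketch`, `zigZag`, `bitsRequired`, `absentPattern`) are made up for illustration and are not Hollow's actual API; only the expressions themselves come from the description above.

```java
final class NullBitPatternSketch {
    /** Zigzag transform: small magnitudes (positive or negative) map to small non-negative values. */
    static long zigZag(long value) {
        return (value << 1) ^ (value >> 63);
    }

    /** Bits needed for one value; the "+ 1" leaves one code point free for "absent". */
    static int bitsRequired(long value) {
        return 64 - Long.numberOfLeadingZeros(zigZag(value) + 1);
    }

    /** Bit pattern reserved for an absent value in a field that is maxBits wide. */
    static long absentPattern(int maxBits) {
        return (1L << maxBits) - 1;
    }

    public static void main(String[] args) {
        long[] values = {0L, 3L, -2L};   // illustrative values, encoded as 0, 6 and 3
        int maxBits = 0;
        for (long v : values) {
            maxBits = Math.max(maxBits, bitsRequired(v));
        }
        System.out.println("maxBits = " + maxBits);                // prints "maxBits = 3"
        System.out.println("absent  = " + absentPattern(maxBits)); // prints "absent  = 7"
    }
}
```

For these small values the reserved pattern (7) does not coincide with any encoded value, which is the intended behaviour.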
The problem encountered is that the maximum number of bits for the `TypeA.id1` field for values `{ 9223372034562340851L, 0L, 3L }` is 64 bits, and `(1L << 64) - 1` is `0`: Java uses only the low six bits of the shift distance for a `long`, so `1L << 64` is the same as `1L << 0`, which is `1`. Therefore the bit pattern for encoding an absent value for this field is `0`, which is also the encoding of the value `0L`.
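The collision can be reproduced in isolation with plain Java. This is a sketch under the assumptions stated above: `ShiftWrapSketch` and its helpers are hypothetical names, and only the expressions and the three values are taken from this thread.

```java
final class ShiftWrapSketch {
    static long zigZag(long value) {
        return (value << 1) ^ (value >> 63);
    }

    /** Widest per-value bit count across the field, using the expression quoted above. */
    static int maxBits(long[] values) {
        int max = 0;
        for (long v : values) {
            max = Math.max(max, 64 - Long.numberOfLeadingZeros(zigZag(v) + 1));
        }
        return max;
    }

    static long absentPattern(int maxBits) {
        // For a long, Java masks the shift distance with 0x3f, so a distance of 64 acts like 0.
        return (1L << maxBits) - 1;
    }

    public static void main(String[] args) {
        long[] ids = {9223372034562340851L, 0L, 3L};
        int bits = maxBits(ids);
        long absent = absentPattern(bits);
        System.out.println(bits);                  // prints 64
        System.out.println(1L << 64);              // prints 1
        System.out.println(absent);                // prints 0
        System.out.println(zigZag(0L) == absent);  // prints true: 0L is indistinguishable from "absent"
    }
}
```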
The result of `new GenericHollowObject(readStateEngine, "TypeA", 1).toString()` is:
a1: null
a2: Beasts of No Nation
a3: 2015
It is assumed the primary key index ignores absent values, hence the object for an id of `0` cannot be found.
I don't yet know what can be done about this. My advice for now would be to restrict long values to a maximum of 62 bits.
Thanks for the detailed explanation, @PaulSandoz!
When you say:
> My advice for now would be to restrict long values to a maximum of 62 bits.
Do you mean restrict long values to a maximum of 62 bits only if looking up a key of 0 in the index is necessary? Or restrict long values to a maximum of 62 bits, full stop?
If I understand the explanation, it sounds like 0 should be the only problematic value. Am I missing something?
@jkade yes, you can avoid a `0` value if you wish. I was being conservative in suggesting that you restrict the maximum bit size, so that you don't need to think about particular values and their meaning with regard to absence (e.g. use an `int` field instead if you can). Note that `Long.MIN_VALUE` is the token absent value for `long` fields used at the API level.
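One way to apply the conservative advice on the producer side is to reject values before they are written. The following is only a sketch: `LongKeyGuardSketch`, `requireSafe` and `MAX_SAFE_BITS` are hypothetical names, not part of Hollow, and "62 bits" is interpreted here as the bit count produced by the expression quoted earlier in this thread.

```java
final class LongKeyGuardSketch {
    /** Assumed limit, taken from the "maximum of 62 bits" advice above. */
    static final int MAX_SAFE_BITS = 62;

    /** Same expression as quoted earlier: bits for the zigzag value plus one spare code point. */
    static int bitsRequired(long value) {
        long zigZag = (value << 1) ^ (value >> 63);
        return 64 - Long.numberOfLeadingZeros(zigZag + 1);
    }

    /** Returns the value unchanged if it is safe to store, otherwise throws. */
    static long requireSafe(long value) {
        // Long.MIN_VALUE is the API-level token for "absent", so it can never be a real value.
        if (value == Long.MIN_VALUE) {
            throw new IllegalArgumentException("Long.MIN_VALUE is reserved for absent values");
        }
        int bits = bitsRequired(value);
        if (bits > MAX_SAFE_BITS) {
            throw new IllegalArgumentException(
                "value " + value + " needs " + bits + " bits, limit is " + MAX_SAFE_BITS);
        }
        return value;
    }
}
```

With this guard, `requireSafe(3L)` returns `3`, while `requireSafe(9223372034562340851L)` throws because that value needs 64 bits.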