comdb2 icon indicating copy to clipboard operation
comdb2 copied to clipboard

Fix incorrect index position caused by loss of precision

Open riverszhang89 opened this issue 2 years ago • 1 comments

Consider the following statements.

CREATE TABLE t (i int, j int)
CREATE INDEX t_ij on t(i, j)
INSERT INTO t VALUES (1.1, 1), (1.1, 2), (2, 3)
SELECT * FROM t WHERE i = 1.1 AND j = 1

The last SELECT statement returns wrong rows, as shown below:

(i=1, j=1)
(i=1, j=2)

To understand the issue, first consider the following statement.

SELECT * FROM t WHERE i = 1 AND j = 1

The cursor on the index is first positioned on the smallest entry that's greater than or equal to (1, 1), that is, the very 1st row in this table. We then continue traversing the index till we see an entry greater than the search key. Hence we stop at (1, 2), returning (1, 1) to client.

However, when searching (1.1, 1), the floating point number is rounded down to 1 (just like how it was inserted). Therefore, instead of being positioned on (2, 3), the cursor is placed on (1, 1), again. Now, we're going to traverse the index till we meet an entry greater than (1.1, 1), that is, (2, 3) (table has (1, 1), (1, 2) and (2, 3)).

To fix this, a new roundup conversion flag is introduced. When lit, mem_to_ondisk, instead of rounding the given value down, it rounds it up to the next higher value in its lower precsion data type. For example, 1.1 would be rounded up to 2; "2023-01-01T000000.000001" would be rounded up to "2023-01-01T000000.001".

(DRQS 171250145)

Signed-off-by: Rivers Zhang [email protected]

riverszhang89 avatar Jan 26 '23 18:01 riverszhang89

See also https://github.com/bloomberg/comdb2/pull/3643 -- seems to fix simple case with int (or datetime) index and search by float or datetime_us.

akshatsikarwar avatar Jan 26 '23 19:01 akshatsikarwar