dolt icon indicating copy to clipboard operation
dolt copied to clipboard

Unexpected results when comparing string with type conversion

Open suyZhong opened this issue 1 year ago • 3 comments

Considering the test case below:

CREATE TABLE t0(c1 VARCHAR(500), PRIMARY KEY(c1)); -- PK is needed to reproduce the issue
INSERT INTO t0 (c1) VALUES ('a');

SELECT * FROM t0; -- 'a'
SELECT (t0.c1 <= 1) FROM t0; -- 1
SELECT * FROM t0 WHERE (t0.c1 <= 1);
-- Expected: 'a'
-- Actual: Empty

The third SELECT returns an empty result, which is surprising: If the result of second query is 1, the value of the <= expression should be 1, and thus the third query should return the row in t0.

I originally found this by building dolt from source version 9360c4e199f900ccd56a58f7355bdacb4552561d

suyZhong avatar Jan 19 '24 07:01 suyZhong

Test for repro:

		{
			Name: "range query convert int to string zero value",
			SetUpScript: []string{
				`CREATE TABLE t0(c1 VARCHAR(500), PRIMARY KEY(c1)); -- PK is needed to reproduce the issue`,
				`INSERT INTO t0 (c1) VALUES ('a');`,
			},
			Assertions: []queries.ScriptTestAssertion{
				{
					Query: "SELECT * FROM t0 WHERE (t0.c1 <= 1);",
					Expected: []sql.Row{{"a"},
				},
			},
		},

Buggy plan:

Filter
 ├─ LessThanOrEqual
 │   ├─ t0.c1:0!null
 │   └─ 1 (tinyint)
 └─ IndexedTableAccess(t0)
     ├─ index: [t0.c1]
     ├─ static: [{(NULL, 1]}]
     ├─ colSet: (1)
     ├─ tableId: 1
     └─ Table
         ├─ name: t0
         └─ columns: [c1]

It looks like we expect "a" to convert to the integer zero value for comparison to 1, but the index scan is only compatible with the opposite transformation, 1 to the zero string type, which is not logically equivalent.

max-hoffman avatar Jan 22 '24 18:01 max-hoffman

Update here.

When converting between types for our indexes, we pick the type that the column in the key is defined over. It appears that MySQL has different rules for these conversions.

We are still looking into how to fix this.

jycor avatar Feb 20 '24 21:02 jycor

We discussed this one some more last week. Looks like it comes down to Dolt/GMS not following MySQL's type conversion rule order. From https://dev.mysql.com/doc/refman/8.0/en/type-conversion.html :

In all other cases, the arguments are compared as floating-point (double-precision) numbers. For example, a comparison of string and numeric operands takes place as a comparison of floating-point numbers.

I think we have this logic encoded for other expressions, but we don't use the same rules for index range scans.

fulghum avatar Mar 12 '24 04:03 fulghum