ElSql icon indicating copy to clipboard operation
ElSql copied to clipboard

@LIKE does not work as expected in presence of escaped wildcards

Open MikeN123 opened this issue 4 years ago • 4 comments

It seems the @LIKE processing, where it generates an = comparison instead of a LIKE when possible, is broken in the presence of escaped wildcard characters.

If you provide test\_test as a parameter, ElSqlConfig.isLikeWildcard(value) will see the escaped wildcard, and return that it is not a 'like wildcard'. Consequently, LikeSqlFragment generates a comparison using the = operator. So you SQL ends up as:

SELECT * FROM x WHERE a = 'test\_test'

This searches for test\_test and not for test_test.

If you specify test_test, it will see the wildcard and generate:

SELECT * FROM x WHERE a LIKE 'test_test'

which matches more than it should. So whether you pass escaped or unescaped parameters, the end result is never what you want.

I am not sure what a good way to fix this would be. The only possible way seems to be to always generate a LIKE, or to let the LikeSqlFragment unescape the parameter if it does not generate a LIKE, but that feels like data it should not be touching.

MikeN123 avatar Jun 26 '20 06:06 MikeN123

I also thought about just removing the escape processing from isLikeWildcard(), but that does not help either. If you have \\ in your parameter value, that can either mean a \ (in case of a LIKE comparison), or \\ (in case of an = comparison). This ambiguity in the parameter input makes it that isLikeWildcard() is never able to infer what you meant.

MikeN123 avatar Jun 26 '20 06:06 MikeN123

IIRC @LIKE should behave:

  • a search for test should use = 'test'
  • a search for test_test should use LIKE 'test_test'
  • a search for test\_test should use = 'test_test' or LIKE test\_test
  • a search for test\_test_test should use LIKE 'test\_test_test'

The code cannot feasibly unescape as the text might contain more than just the search string.

PostgreSQL says that if the search string does not contain a wildcard it behaves like = anyway. Oracle seems to work that way too. Though MySql says there can be some differences in behaviour. So, it seems that the best approach might well be to remove the escape processing, and simply say that any wildcard character means that LIKE should be used. It is possible that it might break something somewhere, but I suspect most databases are like PostgreSQL and would handle it just fine.

Did you want to raise a PR? Otherwise, you could just use LIKE and avoid @LIKE.

jodastephen avatar Jun 26 '20 09:06 jodastephen

I think we will mostly switch to an explicit LIKE or =, as we want to force a = in some cases for performance reasons (see the remark at the end of the post).

I also think removing the escape processing is the only feasible option, that makes the behaviour more predictable. The only special case is then test\\test which could be interpreted as = 'test\test' or LIKE 'test\\test' (with \ as escape character). Do you want to default to LIKE there too? That way users can always feed escaped params and ElSql should do the correct thing.

I can try to find some time to raise a PR.

PostgreSQL says that if the search string does not contain a wildcard it behaves like = anyway.

That's not fully true in case of citext columns at least. It will not use the index when using citext. That is more a technical implementation detail, but can of course have a big performance impact.

MikeN123 avatar Jun 26 '20 09:06 MikeN123

I think test\test should be = but test\\test should be LIKE. ie. any match of \\, \_ or \% would case LIKE processing. Unfortunate, but probably the only practical solution.

jodastephen avatar Jun 26 '20 10:06 jodastephen