opteryx

💰 The Comparison Operator assumes elementwise comparisons

Open joocer opened this issue 3 years ago • 1 comment

We spend time and memory building lists to compare against, and then discard all except the first entry.

Example:

    elif operator == "NotLike":
        # MODIFIED FOR OPTERYX - see comment above
        _check_type("NOT LIKE", identifier_type, (TOKEN_TYPES.VARCHAR))
        matches = compute.match_like(arr, value[0])
        matches = compute.fill_null(matches, True)
        return numpy.invert(matches)

Only the first item of the value list is used. LIKE, ILIKE, and similar operators are never elementwise, so there is no need to spend time and memory building the list.

joocer, Aug 07 '22 21:08

I think the solution is for the expression engine not to expand LITERALS to the number of rows in the table. This has two consequences:

  • if it is being used to add a column to a table, this should detect scalars and expand them
  • if being used in functions that are elementwise, this will need to expand the list
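A rough sketch of how those two consequences might look, in plain Python. Everything here is hypothetical and for illustration only: the `Literal` wrapper, the helper names, and the dict-of-lists stand-in for a table are not part of Opteryx. The idea is that a literal stays a single value until a consumer explicitly asks for a column.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper: a literal held as one scalar, not one value per row."""
    value: Any


Operand = Union[Literal, List[Any]]


def as_scalar(operand: Operand) -> Any:
    """For operators such as LIKE that only ever need a single value."""
    if isinstance(operand, Literal):
        return operand.value
    raise TypeError("operand is a column, not a scalar literal")


def as_column(operand: Operand, row_count: int) -> List[Any]:
    """Expand a literal to row_count entries only when a consumer needs a column."""
    if isinstance(operand, Literal):
        return [operand.value] * row_count
    return operand


def add_column(table: Dict[str, List[Any]], name: str, operand: Operand) -> Dict[str, List[Any]]:
    """Consequence 1: adding a column detects scalars and expands them."""
    row_count = len(next(iter(table.values()))) if table else 0
    return {**table, name: as_column(operand, row_count)}


def elementwise_add(left: Operand, right: Operand, row_count: int) -> List[Any]:
    """Consequence 2: an elementwise function expands literals itself."""
    pairs = zip(as_column(left, row_count), as_column(right, row_count))
    return [a + b for a, b in pairs]


table = {"id": [1, 2, 3]}
with_flag = add_column(table, "flag", Literal("x"))
summed = elementwise_add(table["id"], Literal(10), row_count=3)
like_pattern = as_scalar(Literal("a%"))
```

With this shape, the cost of expansion is paid only by the callers that need a full column, and operators that take a single pattern never build a list.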

joocer, Aug 18 '22 22:08