`row_condition` with SQLAlchemy not working as documented

Open nenkie76 opened this issue 1 year ago • 2 comments

Describe the bug

The documentation states that row conditions for SQL should be specified like this: row_condition='col("foo") != "a-b"'

So, in a Jupyter notebook, I ran the following expectation:

validator.expect_column_values_to_not_be_null(
    column='network_id', 
    condition_parser='great_expectations__experimental__', 
    row_condition='col("source_id") != 1'
)

but it fails with the error: unable to parse condition. I have also tried the following variants, but none of them work:

unable to parse condition: col("source_id").not_in([1])
unable to parse condition: col("source_id") != 1
unable to parse condition: col("source_id") <> 1
unable to parse condition: ~col("source_id") == 1

Something similar was recently raised by @matthiasgomolka.

Environment (please complete the following information):

  • Operating System: MacOS 13.6 (22G120)
  • Great Expectations Version: 0.15.42
  • Data Source: Snowflake (SQLAlchemy)

nenkie76 avatar Oct 19 '23 15:10 nenkie76

Hi @nenkie76, thank you for letting us know. This was also raised in the previous issue https://github.com/great-expectations/great_expectations/issues/8847. A solution is provided in that issue; please take a look. We'll put this in the backlog in the meantime.

HaebichanGX avatar Oct 24 '23 14:10 HaebichanGX

@HaebichanGX, the parallel thread is about Spark and how to write an expression, but this one is only about the != operator, which might not be supported. As I understand it, the error comes from _parse_great_expectations_condition(), but I have not yet had time to debug the root cause.
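
To make the parsing step concrete, here is a minimal, stdlib-only sketch of how a col("...") condition could be matched against a small grammar and turned into a SQL fragment. It is not the Great Expectations implementation: to_sql_where, the regular expressions, and the operator table are all hypothetical, and which operators GX itself accepts is not established here. It only illustrates why any input outside the grammar ends in "unable to parse condition".

```python
import re

# Hypothetical mini-grammar, NOT the Great Expectations grammar:
#   col("<name>") <op> <number or quoted string>
#   col("<name>").notNull()
_COMPARISON = re.compile(
    r'''^\s*col\("(?P<column>[A-Za-z_][A-Za-z0-9_]*)"\)\s*
        (?P<op>==|!=|>=|<=|>|<)\s*
        (?P<value>[+-]?\d+(?:\.\d+)?|"[^"]*"|'[^']*')\s*$''',
    re.VERBOSE,
)
_NOT_NULL = re.compile(
    r'^\s*col\("(?P<column>[A-Za-z_][A-Za-z0-9_]*)"\)\.notNull\(\)\s*$'
)

# Python-style operators mapped to their SQL spellings.
_SQL_OPS = {"==": "=", "!=": "<>", ">": ">", "<": "<", ">=": ">=", "<=": "<="}


def to_sql_where(condition: str) -> str:
    """Translate one toy row condition into a SQL WHERE fragment."""
    match = _NOT_NULL.match(condition)
    if match:
        return f'{match["column"]} IS NOT NULL'

    match = _COMPARISON.match(condition)
    if match is None:
        # Anything outside the mini-grammar is rejected outright.
        raise ValueError(f"unable to parse condition: {condition}")

    value = match["value"]
    if value[0] in "\"'":
        # Re-quote string literals with single quotes, doubling embedded ones.
        value = "'" + value[1:-1].replace("'", "''") + "'"
    return f'{match["column"]} {_SQL_OPS[match["op"]]} {value}'
```

Under these assumptions, col("source_id") != 1 translates to source_id <> 1, while col("source_id") <> 1 and ~col("source_id") == 1 are rejected because they fall outside the mini-grammar.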

nenkie76 avatar Oct 25 '23 20:10 nenkie76

PLEEEEASE get rid of this row_condition='col("foo").notNull()'

and allow simple SQL syntax passthru: row_condition = 'fld1=5 OR fld2<>7 AND fld3 <9'

kujaska avatar Jul 05 '24 14:07 kujaska

You can use this; it works like a SQL WHERE clause:

"condition_parser": "spark", "row_condition": "date_of_birth > DATE \"0001-01-01\" AND gender = 'M' OR dead = true",
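
A minimal sketch of the same suggestion written as Python keyword arguments, using single quotes for the date literal so that no nested double quotes need escaping. The column name is borrowed from the original report purely for illustration, the validator call is shown only as a comment, and the spark parser presumably applies to Spark-backed data rather than SQLAlchemy sources.

```python
import json

# Single-quoted SQL literals avoid clashing with the double quotes that
# delimit the surrounding Python (or JSON) string.
row_condition = "date_of_birth > DATE '0001-01-01' AND gender = 'M' OR dead = true"

expectation_kwargs = {
    "column": "network_id",  # hypothetical column, taken from the original report
    "condition_parser": "spark",
    "row_condition": row_condition,
}

# If the configuration is stored as JSON, json.dumps handles any escaping.
print(json.dumps(expectation_kwargs, indent=2))

# With a Spark-backed validator, the call would presumably look like:
# validator.expect_column_values_to_not_be_null(**expectation_kwargs)
```

Since AND binds more tightly than OR in SQL, this condition reads as (date_of_birth > ... AND gender = 'M') OR dead = true; add parentheses if a different grouping is intended.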

kujaska avatar Aug 20 '24 06:08 kujaska