great_expectations
`row_condition` with SQLAlchemy not working as documented
Describe the bug
The documentation states that row conditions for SQL should be specified like this: row_condition='col("foo") != "a-b"'
So, in a Jupyter notebook I run the following expectation:
validator.expect_column_values_to_not_be_null(
column='network_id',
condition_parser='great_expectations__experimental__',
row_condition='col("source_id") != 1'
)
but it fails with the error: unable to parse condition. I have also tried these variants, but none of them work:
unable to parse condition: col("source_id").not_in([1])
unable to parse condition: col("source_id") != 1
unable to parse condition: col("source_id") <> 1
unable to parse condition: ~col("source_id") == 1
Something similar was recently raised by @matthiasgomolka
Environment (please complete the following information):
- Operating System: macOS 13.6 (22G120)
- Great Expectations Version: 0.15.42
- Data Source: Snowflake (SQLAlchemy)
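For reference, the check that the expectation above is meant to perform can be reproduced outside Great Expectations. This is only a minimal sketch using Python's built-in sqlite3 module instead of Snowflake; the table name `events` and the sample rows are made up for illustration and are not from the report.

```python
import sqlite3

# Hypothetical in-memory table; the real data in the report lives in Snowflake.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (source_id INTEGER, network_id INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, None), (2, 10), (3, None), (2, 11)],
)

# The row condition selects rows where source_id != 1.
# Among those rows, count NULL network_id values, i.e. the rows that
# expect_column_values_to_not_be_null would be expected to flag.
unexpected_count = conn.execute(
    "SELECT COUNT(*) FROM events WHERE source_id != 1 AND network_id IS NULL"
).fetchone()[0]

print(unexpected_count)
```

The row with source_id = 1 is excluded by the condition, so its NULL network_id is not counted.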
Hi @nenkie76, thank you for letting us know. This was also raised in the previous issue https://github.com/great-expectations/great_expectations/issues/8847. There is a solution provided in that issue, so please take a look. We'll put this in the backlog in the meantime.
@HaebichanGX, the parallel thread is about Spark and how to write an expression, but this one is only about the != operator, which might not be supported. As I understand it, this comes from _parse_great_expectations_condition(), but I have not had time yet to debug the root cause.
PLEEEEASE get rid of this row_condition='col("foo").notNull()' and allow simple SQL syntax passthru: row_condition = 'fld1=5 OR fld2<>7 AND fld3 <9'
you may use this - it works like a SQL WHERE:
"condition_parser": "spark", "row_condition": "date_of_birth > DATE '0001-01-01' AND gender = 'M' OR dead = true",
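One thing to watch with WHERE-style conditions like this: in SQL, AND binds tighter than OR, so the condition above reads as (date AND gender) OR dead. A small sketch with Python's built-in sqlite3 shows the difference; the `people` table and its rows are invented for illustration, and SQLite syntax stands in for Spark SQL here (plain string comparison for dates, 1 for true).

```python
import sqlite3

# Hypothetical sample data, not from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (date_of_birth TEXT, gender TEXT, dead INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("1990-05-01", "M", 0),  # passes the date AND gender part
        ("1990-05-01", "F", 0),  # passes nothing
        ("0001-01-01", "F", 1),  # passes only the dead part
    ],
)

def count_rows(where_clause):
    """Count rows matching a WHERE clause (trusted, hard-coded strings only)."""
    query = f"SELECT COUNT(*) FROM people WHERE {where_clause}"
    return conn.execute(query).fetchone()[0]

# No parentheses: evaluated as (A AND B) OR C.
unparenthesized = count_rows("date_of_birth > '0001-01-01' AND gender = 'M' OR dead = 1")
# Same meaning, written out explicitly.
and_first = count_rows("(date_of_birth > '0001-01-01' AND gender = 'M') OR dead = 1")
# A different condition: A AND (B OR C).
or_first = count_rows("date_of_birth > '0001-01-01' AND (gender = 'M' OR dead = 1)")

print(unparenthesized, and_first, or_first)
```

If the intent is the second grouping, add the parentheses to the row_condition string explicitly.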