great_expectations

Azure Synapse issue with table level expectations

Open dollyBa opened this issue 1 year ago • 2 comments

Hi,

I am trying to use Great Expectations on an Azure Synapse Workspace with an Azure Synapse dedicated SQL pool as the data source. I am running into issues when trying to run an expectation suite. The error is as follows:

ge_exceptions.MetricResolutionError(great_expectations.exceptions.exceptions.MetricResolutionError: 'NoneType' object is not iterable)

Note: I am getting this error only for column-level expectations. Table-level expectations are working fine.

Expected behavior: Ideally, both table-level and column-level expectations should run without errors.
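
To illustrate the distinction, here is a minimal sketch of the two kinds of expectations I am comparing (the particular expectation names and the column "my_column" are placeholders, not necessarily the ones from my suite, assuming a validator built against the Synapse batch):

# Table-level expectation: runs without errors against the dedicated SQL pool
validator.expect_table_row_count_to_be_between(min_value=1)

# Column-level expectation: fails with the MetricResolutionError quoted above
validator.expect_column_values_to_not_be_null(column="my_column")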

Environment:

  • Azure Synapse Analytics
  • Great Expectations Version: 0.15.25

Additional context: I tried running the same thing with Azure SQL as the data source, and it worked fine. The issue occurs only with the dedicated Synapse SQL pool.

dollyBa · Oct 14 '22 05:10

Howdy @dollyBa :wave: thanks for raising this with us and being a part of this lovely community :bow:

Do you happen to have a full stack trace, and can you share the workflow/configuration so we can look into it further? :microscope:

AFineDayFor · Oct 14 '22 15:10

Hey,

Please find below the workflow/configuration for further reference:

Below is the configuration used to set up the BaseDataContext:

# imports used by this configuration (great_expectations 0.15.x)
from great_expectations.data_context import BaseDataContext
from great_expectations.data_context.types.base import DataContextConfig, DatasourceConfig

data_context_config = DataContextConfig(
    config_version=2,
    plugins_directory=None,
    config_variables_file_path=None,
    datasources={
        "my_spark_datasource_config": DatasourceConfig(
            class_name="Datasource",
            execution_engine={
                "class_name": "SqlAlchemyExecutionEngine",
                "module_name": "great_expectations.execution_engine",
                "connection_string": "mssql+pyodbc://<user_name>:<password>@<server_name>/<database_name>?driver=ODBC Driver 17 for SQL Server&charset=utf&autocommit=true",
            },
            data_connectors={
                "default_runtime_data_connector_name": {
                    "class_name": "RuntimeDataConnector",
                    "module_name": "great_expectations.datasource.data_connector",
                    "batch_identifiers": "default_identifier_name",
                },
                "default_inferred_data_connector_name": {
                    "class_name": "InferredAssetSqlDataConnector",
                    "module_name": "great_expectations.datasource.data_connector",
                    "introspection_directives": {"schema_name": "<schema_name>"},
                    "include_schema_name": "true",
                },
                "default_configured_data_connector_name": {
                    "class_name": "ConfiguredAssetSqlDataConnector",
                    "module_name": "great_expectations.datasource.data_connector",
                    "assets": {
                        "<table_name>": {
                            "class_name": "Asset",
                            "module_name": "great_expectations.datasource.data_connector.asset",
                            "schema_name": "<schema_name>",
                        }
                    },
                    "force_reuse_spark_context": "True",
                },
            },
        )
    },
    stores={
        "expectations_SQL_store": {
            "class_name": "ExpectationsStore",
            "store_backend": {
                "class_name": "DatabaseStoreBackend",
                "connection_string": "mssql+pyodbc://<db_user>:<db_pass>@<db_servername>/<db_name>?driver=ODBC Driver 17 for SQL Server",
            },
        },
        "validations_SQL_store": {
            "class_name": "ValidationsStore",
            "store_backend": {
                "class_name": "DatabaseStoreBackend",
                "connection_string": "mssql+pyodbc://<db_user>:<db_pass>@<db_servername>/<db_name>?driver=ODBC Driver 17 for SQL Server",
            },
        },
        "evaluation_parameter_store": {"class_name": "EvaluationParameterStore"},
    },
    expectations_store_name="expectations_SQL_store",
    validations_store_name="validations_SQL_store",
    evaluation_parameter_store_name="evaluation_parameter_store",
    validation_operators={
        "action_list_operator": {
            "class_name": "ActionListValidationOperator",
            "action_list": [
                {
                    "name": "store_validation_result",
                    "action": {"class_name": "StoreValidationResultAction"},
                },
                {
                    "name": "store_evaluation_params",
                    "action": {"class_name": "StoreEvaluationParametersAction"},
                },
                {
                    "name": "update_data_docs",
                    "action": {"class_name": "UpdateDataDocsAction"},
                },
            ],
        }
    },
    anonymous_usage_statistics={
      "enabled": True
    }
)

context = BaseDataContext(project_config=data_context_config)
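
As a quick sanity check at this point (a sketch, not part of the failing run), one can list the datasources to confirm the in-memory config was registered:

# should include "my_spark_datasource_config" if the config was applied
print(context.list_datasources())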

And the following is a skeleton of how I have created the validator object:

# import used by the batch request (great_expectations 0.15.x)
from great_expectations.core.batch import RuntimeBatchRequest

suite = context.create_expectation_suite(expectation_suite_name=expectation_suite_name)

# Set the data source
batch_request = RuntimeBatchRequest(
    datasource_name="my_spark_datasource_config",
    data_connector_name="default_runtime_data_connector_name",
    data_asset_name="my_data_asset_name",
    runtime_parameters={
        "query": <query>
    },
    batch_identifiers={"d": "batch_run_id"},
    batch_spec_passthrough={
        "create_temp_table": False
    },
)

validator = context.get_validator(
    batch_request=batch_request,
    expectation_suite_name=expectation_suite_name,
)
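
The suite's expectations are then evaluated through this validator, roughly like the sketch below (the validate/save calls illustrate the flow rather than my literal code):

# Evaluating the suite: table-level expectations pass, while column-level
# expectations come back with the MetricResolutionError quoted above
results = validator.validate()

# Persist the suite, keeping failed expectations for debugging
validator.save_expectation_suite(discard_failed_expectations=False)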

dollyBa · Oct 16 '22 12:10

Hey @AFineDayFor, do let me know if you need any further details from my end.

dollyBa · Oct 20 '22 05:10