`InferredAssetSqlDataConnector` does not respect passed `introspection_directives`

Open jhanlonhg opened this issue 3 years ago • 6 comments

Describe the bug When creating an InferredAssetSqlDataConnector with the introspection_directives parameter set, no error is thrown, but the introspection_directives input is not respected.

To Reproduce Steps to reproduce the behavior:

  1. Run great_expectations --v3-api datasource new in an appropriate directory.
  2. Follow the normal workflow, and use this for the contents of the example.yaml code cell:
example_yaml = f"""
name: {datasource_name}
class_name: Datasource
execution_engine:
  class_name: SqlAlchemyExecutionEngine
  credentials:
    host: {host}
    username: {username}
    database: {database}
    query:
      schema: {schema}
      warehouse: {warehouse}
      role: {role}
    private_key_path: {private_key_path}
    private_key_passphrase: {private_key_passphrase}
    drivername: snowflake
data_connectors:
  default_runtime_data_connector_name:
    class_name: RuntimeDataConnector
    batch_identifiers:
      - default_identifier_name
  default_inferred_data_connector_name:
    class_name: InferredAssetSqlDataConnector
    introspection_directives:
        schema_name: {schema}"""
print(example_yaml)
  3. Then run the context.test_yaml_config(yaml_config=example_yaml) cell. The output for default_inferred_data_connector_name shows data_assets from schemas outside of the {schema} specified in introspection_directives.
  4. Next, re-run the example_yaml cell with these contents:
example_yaml = f"""
name: {datasource_name}
class_name: SimpleSqlalchemyDatasource
credentials:
    host: {host}
    username: {username}
    database: {database}
    query:
      schema: {schema}
      warehouse: {warehouse}
      role: {role}
    private_key_path: {private_key_path}
    private_key_passphrase: {private_key_passphrase}
    drivername: snowflake
introspection:
  default_inferred_data_connector_name:
    introspection_directives:
        schema_name: {schema}"""
print(example_yaml)
  5. Now, when we run the context.test_yaml_config(yaml_config=example_yaml) cell, only data_assets from {schema} are present.
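
To compare the two runs without reading the output by eye, a small helper along these lines may be useful. It is only a sketch: the function name assets_outside_schema is hypothetical, the asset names are made up for illustration, and it assumes the data asset names are schema-qualified as "<schema>.<table>", which may not match what your connector reports.

```python
from typing import Iterable, List


def assets_outside_schema(asset_names: Iterable[str], schema: str) -> List[str]:
    """Return the asset names that do not belong to `schema`.

    Assumption: names are schema-qualified as "<schema>.<table>". An
    unqualified name is reported as outside the schema, because its
    schema cannot be verified from the name alone.
    """
    prefix = schema.lower() + "."
    return [name for name in asset_names if not name.lower().startswith(prefix)]


# Hypothetical asset names, as they might be copied by hand from the
# test_yaml_config output of the first configuration.
reported = ["analytics.orders", "analytics.customers", "public.audit_log"]
leaked = assets_outside_schema(reported, "analytics")
print(leaked)
```

An empty list would mean that every reported asset falls under the requested schema, which is the behavior expected from introspection_directives.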

Expected behavior When introspection_directives are specified on an Inferred Data Connector, they are correctly consumed and respected (e.g., when setting introspection_directives:schema_name:{value}, only tables from that schema appear in that Data Connector).

Environment (please complete the following information):

  • Operating System: MacOS
  • Great Expectations Version: 0.13.37

Additional context This is my requirements.txt for the project:

great-expectations == 0.13.37
sqlalchemy == 1.4.25
snowflake-connector-python == 2.6.2
snowflake-sqlalchemy == 1.3.2
black == 21.9b0

This is the introspection_directives parameter I am attempting to utilize: https://github.com/great-expectations/great_expectations/blob/feacf843f6b9d439c77382e2cfd8c612cc4be53b/great_expectations/datasource/data_connector/inferred_asset_sql_data_connector.py#L58

jhanlonhg, Nov 09 '21 23:11

Thank you for opening this, @jhanlonhg! This is a known issue, and is the reason why we still use SimpleSqlAlchemyDatasource. We do have plans to make this work with the Datasource class, so I will keep this open for reference.

talagluck, Nov 22 '21 16:11

Hey @jhanlonhg! Reaching out to let you know this is still in the queue for our core engineering team, and we'll be discussing next steps in the near future. There has been a significant amount of motion in this area in recent weeks; are you able to confirm that you're still experiencing this behavior?

austiezr, Aug 05 '22 18:08

Hey Austin,

Sorry for the late reply, but I have been out of office since your initial email.

Unfortunately, I am unaware if this is still an issue that our team is experiencing, as we are not performing active development of Great Expectations repos at this time.

However, if I remember this issue correctly, I would be surprised if it was resolved short of active developer intervention, as there was an issue where the setting in the yaml would never reach the appropriate code to be enacted.

Sorry I couldn't be more helpful

jhanlonhg, Aug 09 '22 16:08

This is still an issue that is causing my team very long run times as the introspection will scan way too many tables if the schema cannot be specified.

pdraznik-cadent, Sep 09 '22 19:09

Thanks @jhanlonhg and @pdraznik-cadent - this was addressed in the latest release (as of last Thursday). Can you please confirm whether this has been resolved for you?

talagluck, Sep 13 '22 10:09

Thanks @talagluck this is working with the update and we're seeing greatly improved times! Are there any plans to provide additional filtering mechanisms for specific tables? This would help us to decrease our run times even further 😄

pdraznik-cadent, Sep 15 '22 22:09

Closing as resolved in 0.15.22

austiezr, Feb 02 '23 18:02