great_expectations
`InferredAssetSqlDataConnector` does not respect passed `introspection_directives`
Describe the bug
When creating an `InferredAssetSqlDataConnector` that uses the `introspection_directives` parameter, no error is thrown, but the `introspection_directives` input is not respected.
To Reproduce
Steps to reproduce the behavior:
- Run `great_expectations --v3-api datasource new` in an appropriate directory.
- Follow the normal workflow, and use this for the contents of the `example_yaml` code cell:
example_yaml = f"""
name: {datasource_name}
class_name: Datasource
execution_engine:
  class_name: SqlAlchemyExecutionEngine
  credentials:
    host: {host}
    username: {username}
    database: {database}
    query:
      schema: {schema}
      warehouse: {warehouse}
      role: {role}
    private_key_path: {private_key_path}
    private_key_passphrase: {private_key_passphrase}
    drivername: snowflake
data_connectors:
  default_runtime_data_connector_name:
    class_name: RuntimeDataConnector
    batch_identifiers:
      - default_identifier_name
  default_inferred_data_connector_name:
    class_name: InferredAssetSqlDataConnector
    introspection_directives:
      schema_name: {schema}"""
print(example_yaml)
- Then run the `context.test_yaml_config(yaml_config=example_yaml)` cell. The output for `default_inferred_data_connector_name` shows `data_assets` from schemas outside of the `{schema}` specified in `introspection_directives`.
- Next, re-run the `example_yaml` cell with these contents:
example_yaml = f"""
name: {datasource_name}
class_name: SimpleSqlalchemyDatasource
credentials:
  host: {host}
  username: {username}
  database: {database}
  query:
    schema: {schema}
    warehouse: {warehouse}
    role: {role}
  private_key_path: {private_key_path}
  private_key_passphrase: {private_key_passphrase}
  drivername: snowflake
introspection:
  default_inferred_data_connector_name:
    introspection_directives:
      schema_name: {schema}"""
print(example_yaml)
- Now when we run the `context.test_yaml_config(yaml_config=example_yaml)` cell, only `data_assets` from `{schema}` are present.
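For anyone reproducing this, a small helper can make the mismatch between the two runs easier to spot. This is only a sketch: it assumes the asset names in the `test_yaml_config` output are schema-qualified as `schema.table`, and the function names and sample asset names are hypothetical, not part of Great Expectations.

```python
from collections import defaultdict


def group_assets_by_schema(asset_names):
    """Group schema-qualified asset names ("schema.table") by schema."""
    grouped = defaultdict(list)
    for name in asset_names:
        # Split on the last dot; names without a dot get an empty schema.
        schema, _, table = name.rpartition(".")
        grouped[schema.lower()].append(table)
    return dict(grouped)


def assets_outside_schema(asset_names, expected_schema):
    """Return the asset names whose schema differs from expected_schema."""
    expected = expected_schema.lower()
    return [
        name for name in asset_names
        if name.rpartition(".")[0].lower() != expected
    ]


# Hypothetical asset names, standing in for the test_yaml_config output.
reported = ["analytics.events", "analytics.sessions", "staging.raw_events"]
grouped = group_assets_by_schema(reported)
leaked = assets_outside_schema(reported, "analytics")
print(leaked)
```

With the `Datasource` configuration above, a check like this would presumably return a non-empty list; with the `SimpleSqlalchemyDatasource` configuration it should return an empty one.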
Expected behavior
When specifying `introspection_directives` on an Inferred Data Connector, those introspection directives are correctly consumed and respected (e.g., when setting `introspection_directives: schema_name: {value}`, only tables from that `{schema}` appear in that Data Connector).
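To make the expected behavior concrete, here is a minimal stand-in that does not use Great Expectations or Snowflake at all. It uses Python's built-in `sqlite3` module with an attached database playing the role of a second schema; the `list_tables` helper and the table names are made up for illustration. Without a `schema_name`, introspection walks every schema; with one, only that schema's tables come back.

```python
import sqlite3


def list_tables(conn, schema_name=None):
    """Return (schema, table) pairs, optionally restricted to one schema."""
    # PRAGMA database_list yields (seq, name, file) for each attached schema.
    schemas = [row[1] for row in conn.execute("PRAGMA database_list")]
    if schema_name is not None:
        schemas = [s for s in schemas if s == schema_name]
    tables = []
    for schema in schemas:
        rows = conn.execute(
            f'SELECT name FROM "{schema}".sqlite_master '
            "WHERE type = ? ORDER BY name",
            ("table",),
        )
        tables.extend((schema, row[0]) for row in rows)
    return tables


conn = sqlite3.connect(":memory:")
# The attached in-memory database acts as a second schema named "analytics".
conn.execute("ATTACH DATABASE ':memory:' AS analytics")
conn.execute("CREATE TABLE main.users (id INTEGER)")
conn.execute("CREATE TABLE analytics.events (id INTEGER)")
conn.execute("CREATE TABLE analytics.sessions (id INTEGER)")

all_tables = list_tables(conn)  # scans every schema
only_analytics = list_tables(conn, schema_name="analytics")  # scans one schema
print(only_analytics)
```

The restricted call is what I would expect `introspection_directives` with a `schema_name` to produce on the Inferred Data Connector.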
Environment (please complete the following information):
- Operating System: MacOS
- Great Expectations Version: 0.13.37
Additional context
This is my `requirements.txt` for the project:
great-expectations == 0.13.37
sqlalchemy == 1.4.25
snowflake-connector-python == 2.6.2
snowflake-sqlalchemy == 1.3.2
black == 21.9b0
This is the `introspection_directives` parameter I am attempting to utilize: https://github.com/great-expectations/great_expectations/blob/feacf843f6b9d439c77382e2cfd8c612cc4be53b/great_expectations/datasource/data_connector/inferred_asset_sql_data_connector.py#L58
Thank you for opening this, @jhanlonhg! This is a known issue, and is the reason why we still use `SimpleSqlalchemyDatasource`. We do have plans to make this work with the `Datasource` class, so I will keep this open for reference.
Hey @jhanlonhg! Reaching out to let you know this is still in the queue for our core engineering team, and we'll be discussing next steps in the near future. There has been a significant amount of motion in this area in recent weeks; are you able to confirm that you're still experiencing this behavior?
Hey Austin,
Sorry for the late reply, but I have been out of office since your initial email.
Unfortunately, I am unaware if this is still an issue that our team is experiencing, as we are not performing active development of Great Expectations repos at this time.
However, if I remember this issue correctly, I would be surprised if it was resolved short of active developer intervention, as there was an issue where the setting in the yaml would never reach the appropriate code to be enacted.
Sorry I couldn't be more helpful
This is still an issue that is causing my team very long run times as the introspection will scan way too many tables if the schema cannot be specified.
Thanks @jhanlonhg and @pdraznik-cadent - this was addressed in the latest release (as of last Thursday). Can you please confirm whether this has been resolved for you?
Thanks @talagluck, this is working with the update and we're seeing greatly improved times! Are there any plans to provide additional filtering mechanisms for specific tables? This would help us to decrease our run times even further 😄
Closing as resolved in 0.15.22