great_expectations
great_expectations copied to clipboard
[FEATURE] Add support for Vertica dialect
Add support to Vertica dialiect (using sqlalchemy).
Changes proposed in this pull request:
- Specific the query "create temp table" for Vertica.
- FIx
build_batch_spec
can't accept multiple arguments, eg. bothquery
andcreate_temp_table
. - Add vertica dependency to requirements.
Definition of Done
Please delete options that are not relevant.
- [x] My code follows the Great Expectations style guide
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have added unit tests where applicable and made sure that new and existing tests are passing.
- [ ] I have run any local integration tests and made sure that nothing is broken.
Thank you for submitting!
Deploy request for niobium-lead-7998 pending review.
Visit the deploys page to approve it
Name | Link |
---|---|
Latest commit | aa22690a037ca91fa8f5e74ba846f5d76fdb83c3 |
Why do we need to enhance the build_batch_spec
?
To config some runtime batch requests like this:
from great_expectations.core.batch import RuntimeBatchRequest
validator = data_context.get_validator(
batch_request=RuntimeBatchRequest(
datasource_name="vertica",
data_connector_name="default_runtime_data_connector_name",
data_asset_name="stagging.TABLE_NAME",
runtime_parameters={
"query": f"SELECT * FROM TABLE_NAME limit 100",
"create_temp_table": False,
"temp_table_schema_name": "staging"
},
batch_identifiers={
"default_identifier_name": "Test TABLE_NAME",
},
),
expectation_suite_name=dummy_expectation_suite_name,
)
This pull request also resolves the issue 3328.
Howdy @viplazylmht thanks oodles for submitting this and https://github.com/great-expectations/great_expectations/issues/3328
You still need to follow the instructions from the cal-bot for that check to pass https://github.com/great-expectations/great_expectations/pull/6145#issuecomment-1274315389
@cla-bot check
@AFineDayFor Hello there, I have submitted the cla for both [email protected]
and [email protected]
, and rechecked with cla-bot but it still failed.
What should I do now? c.c @kyleaton
Thank you so much for this contribution @viplazylmht! I'll take a look and get back to you with a review.
@anthonyburdi Thanks for your help.
We still have some issues, when the create table as {query}
statement can not work with a query returning complex types column (checkout the docs). But I think this is the worst case, and we can have a workaround solution when someone addresses it.
I also tested some expectations and noticed that expectations related to column data type
(e.g expect_column_values_to_be_of_type
) will raise errors because the ExecutionEngine can not get the data type of the table (or temp table), then lead the data passed to the expectation missing 'type' key. I don't know where we retrieve data types of columns using sqlalchemy, and why Vertica differs from another dialect, so I can't fix this for now.
Thanks so much @viplazylmht! I think this PR adds considerable new functionality, and if you are keen then addressing the issues you described in future contributions would be greatly appreciated. Adding documentation for Vertica is also another big help if that is something you are interested in, consider the Connect to Data "Database" and/or Reference Architectures sections.
~~I did check our CLA bot registry and I don't see [email protected]
listed, would you mind trying again to sign the CLA with that email address?~~ Nevermind, Kyle fixed it - merging!
@cla-bot check