great_expectations icon indicating copy to clipboard operation
great_expectations copied to clipboard

[FEATURE] Add support for Vertica dialect

Open viplazylmht opened this issue 1 year ago • 8 comments

Add support to Vertica dialiect (using sqlalchemy).

Changes proposed in this pull request:

  • Specific the query "create temp table" for Vertica.
  • FIx build_batch_spec can't accept multiple arguments, eg. both query and create_temp_table.
  • Add vertica dependency to requirements.

Definition of Done

Please delete options that are not relevant.

  • [x] My code follows the Great Expectations style guide
  • [x] I have performed a self-review of my own code
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have added unit tests where applicable and made sure that new and existing tests are passing.
  • [ ] I have run any local integration tests and made sure that nothing is broken.

Thank you for submitting!

viplazylmht avatar Oct 11 '22 08:10 viplazylmht

Deploy request for niobium-lead-7998 pending review.

Visit the deploys page to approve it

Name Link
Latest commit aa22690a037ca91fa8f5e74ba846f5d76fdb83c3

netlify[bot] avatar Oct 11 '22 08:10 netlify[bot]

Why do we need to enhance the build_batch_spec?

To config some runtime batch requests like this:

from great_expectations.core.batch import RuntimeBatchRequest

validator = data_context.get_validator(
      batch_request=RuntimeBatchRequest(
        datasource_name="vertica",
        data_connector_name="default_runtime_data_connector_name",
        data_asset_name="stagging.TABLE_NAME",
        runtime_parameters={
            "query": f"SELECT * FROM TABLE_NAME limit 100",
            "create_temp_table": False,
            "temp_table_schema_name": "staging"
        },
        batch_identifiers={
            "default_identifier_name": "Test TABLE_NAME",
        },
      ),
      expectation_suite_name=dummy_expectation_suite_name,
    )

viplazylmht avatar Oct 11 '22 08:10 viplazylmht

This pull request also resolves the issue 3328.

viplazylmht avatar Oct 12 '22 08:10 viplazylmht

Howdy @viplazylmht thanks oodles for submitting this and https://github.com/great-expectations/great_expectations/issues/3328

You still need to follow the instructions from the cal-bot for that check to pass https://github.com/great-expectations/great_expectations/pull/6145#issuecomment-1274315389

AFineDayFor avatar Oct 12 '22 20:10 AFineDayFor

@cla-bot check

viplazylmht avatar Oct 13 '22 07:10 viplazylmht

@AFineDayFor Hello there, I have submitted the cla for both [email protected] and [email protected], and rechecked with cla-bot but it still failed. image

What should I do now? c.c @kyleaton

viplazylmht avatar Oct 13 '22 07:10 viplazylmht

Thank you so much for this contribution @viplazylmht! I'll take a look and get back to you with a review.

anthonyburdi avatar Oct 14 '22 13:10 anthonyburdi

@anthonyburdi Thanks for your help.

We still have some issues, when the create table as {query} statement can not work with a query returning complex types column (checkout the docs). But I think this is the worst case, and we can have a workaround solution when someone addresses it.

I also tested some expectations and noticed that expectations related to column data type (e.g expect_column_values_to_be_of_type) will raise errors because the ExecutionEngine can not get the data type of the table (or temp table), then lead the data passed to the expectation missing 'type' key. I don't know where we retrieve data types of columns using sqlalchemy, and why Vertica differs from another dialect, so I can't fix this for now.

viplazylmht avatar Oct 18 '22 02:10 viplazylmht

Thanks so much @viplazylmht! I think this PR adds considerable new functionality, and if you are keen then addressing the issues you described in future contributions would be greatly appreciated. Adding documentation for Vertica is also another big help if that is something you are interested in, consider the Connect to Data "Database" and/or Reference Architectures sections.

~~I did check our CLA bot registry and I don't see [email protected] listed, would you mind trying again to sign the CLA with that email address?~~ Nevermind, Kyle fixed it - merging!

anthonyburdi avatar Oct 18 '22 14:10 anthonyburdi

@cla-bot check

kyleaton avatar Oct 18 '22 18:10 kyleaton