great_expectations

Can I use Great Expectations to validate a Pinot OLAP database?

Open INNOCENT-BOY opened this issue 2 years ago • 2 comments

Is your feature request related to a problem? Please describe. I want to use Great Expectations to validate a Pinot OLAP database. I tried connecting to Pinot through SQLAlchemy and can retrieve the table names, but I can't acquire a batch of data.

[My thoughts] The schema concept in Pinot seems to differ from the one in Great Expectations. See picture 3 below: FROM callanalyzer_result_metrics_kevin."airlineStats". Pinot does not accept this format. callanalyzer_result_metrics_kevin is just the schema and airlineStats is the table name; in Pinot, the table and the schema are separate parts.
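To make the shape of that FROM clause concrete, here is a minimal pure-Python sketch of how a SQLAlchemy-style compiler typically renders a schema-qualified table name. It is a simplified imitation for illustration only, not SQLAlchemy's or Great Expectations' actual code: the helpers quote_if_needed and from_target are hypothetical, and the quoting rule (quote anything that is not plain lowercase) is a deliberate simplification.

```python
import re
from typing import Optional

# Simplified rule (an assumption): identifiers made only of lowercase letters,
# digits, "_" and "$" are emitted bare; everything else is double-quoted.
_PLAIN_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_$]*$")


def quote_if_needed(name: str) -> str:
    """Return the identifier, double-quoted if it is not plain lowercase."""
    if _PLAIN_IDENTIFIER.match(name):
        return name
    return '"' + name.replace('"', '""') + '"'


def from_target(table: str, schema: Optional[str] = None) -> str:
    """Build the table reference that would follow FROM (hypothetical helper)."""
    parts = [schema, table] if schema else [table]
    return ".".join(quote_if_needed(part) for part in parts)


# With a schema set, the name is rendered as schema."table".
qualified = from_target("airlineStats", schema="callanalyzer_result_metrics_kevin")
# Without a schema, only the (quoted) table name is rendered.
unqualified = from_target("airlineStats")

print("FROM", qualified)
print("FROM", unqualified)
```

The first form is the one from the screenshot. Whether Pinot accepts the second, unqualified form through a given SQLAlchemy dialect is not confirmed in this thread.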

Describe the solution you'd like If Great Expectations can validate Pinot, which tutorial should I follow? If Great Expectations does not support validating a Pinot OLAP database, how would I develop this new feature (relevant classes, etc.)?

Additional context [three screenshots, not preserved]

Add any other context or screenshots about the feature request here. @mmetzger @sotte @eugmandel @afeld @lcorneliussen

INNOCENT-BOY avatar Apr 13 '22 07:04 INNOCENT-BOY

Hi @INNOCENT-BOY - thanks for raising this. Great Expectations does not officially support the Pinot OLAP database. Given that Great Expectations uses SQLAlchemy on the back-end, it's possible that you may get some of the functionality to work despite Pinot not being supported.

Some of the common places for failure with unsupported SQL dialects are temp table creation and specific Expectation logic (e.g. for regex or complex statistical Expectations). You can try turning off temp table creation in your example by adding the following to your BatchRequest: batch_spec_passthrough={"create_temp_table": False}
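As a rough sketch of where that argument goes, the snippet below builds the keyword arguments for a BatchRequest with temp table creation turned off. The datasource and data connector names are placeholders and must match your own configuration; the import path assumes a Great Expectations release from around the time of this thread, so the construction is guarded in case the library is missing or its API differs.

```python
# Keyword arguments for the BatchRequest. The first two names are
# placeholders (assumptions); replace them with your own configuration.
batch_request_kwargs = {
    "datasource_name": "my_pinot_datasource",           # hypothetical name
    "data_connector_name": "my_inferred_connector",     # hypothetical name
    "data_asset_name": "airlineStats",                   # table from the issue
    # Ask the SQLAlchemy execution engine not to create a temp table.
    "batch_spec_passthrough": {"create_temp_table": False},
}

try:
    # Import path as of the 0.14/0.15 releases; may differ in other versions.
    from great_expectations.core.batch import BatchRequest

    batch_request = BatchRequest(**batch_request_kwargs)
except Exception:
    # Library not installed or API changed: keep only the plain kwargs.
    batch_request = None

print(batch_request_kwargs["batch_spec_passthrough"])
```

The resulting batch_request would then be passed to whatever call you already use to get a validator or batch. This only disables temp table creation; individual Expectations may still fail on an unsupported dialect.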

It's great to hear that you are interested in contributing support for Pinot OLAP to Great Expectations! We are currently working on a guide for the right way to do this, which should be available in a few weeks. In the meantime, you can take a look at the SqlAlchemyExecutionEngine class, and the specific SQLAlchemy implementations in our Metrics (e.g. here).

talagluck avatar Apr 13 '22 18:04 talagluck

@talagluck Thanks for your prompt reply. I now have the answer I was looking for. I will focus on this feature in the future and discuss with my team whether it needs to be developed right now. Looking forward to your guide on how to contribute to the community.

INNOCENT-BOY avatar Apr 14 '22 01:04 INNOCENT-BOY