trino
trino copied to clipboard
Elastic search plugin support for pushdown TopN and Aggregation
Description
Additional context and related issues
This PR provides the support pushing down TopN and Aggregation function to elasticsearch.
( ) This is not user-visible or docs only and no release notes are required. ( x) Release notes are required, please propose a release note for me. ( ) Release notes are required, with the following suggested text:
# Fixes
- #12381
- #7026
- #4803
Following for updates. Elastic is a widely used data store and it has very poor sql support.
@hashhar @kokosing , Can you please review this merge request
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
:wave: @murthy-chelankuri @martint @bitsondatadev - this PR has become inactive. If you're still interested in working on it, please let us know.
Also fyi .. thanks to the work of @wendigo we now also have an OpenSearch connector .. so the changes could be ported there as well.
We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
@mosabua sorry for the delayed response. Yes I would like these changes merged. Please let me know about the next steps on this.
Can you check out the test failures and ensure they at least do not occur on your own setup.
@mosabua I made a few changes, and now the tests are successful without any issues. However, the pipeline is failing even for the build, although both the build and tests worked well on my local machine.
@murthy-chelankuri the code doesn't compile. Have you rebased it on master? I guess there are some logical merge conflicts
@wendigo Yes, I rebased with master. It's surprising that everything is working fine locally. Could there be an issue with the pipeline?
@murthy-chelankuri definitely not. I guess that your local setup is invalid rather then our pipelines which runs several hundred times a week.
This branch is not rebased:
commit e1dc146f2409447318c47a0ebf1167fa4c315942
Author: Martin Traverso <[email protected]>
Date: Mon Sep 25 09:21:59 2023 -0700
Avoid exponential DFA in LIKE matcher
When a pattern contains a % and a sequence of _ somewhere
after it, it may produce a DFA with exponential number of states.
In such cases, fall back to the NFA matcher.
it's based on commit from 6 months ago. In the meantime, elasticsearch client was updated.
After rebasing I'm getting the same compilation errors as in the CI:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.12.1:compile (default-compile) on project trino-elasticsearch: Compilation failure: Compilation failure:
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[37,57] package org.elasticsearch.search.aggregations.metrics.avg does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[38,57] package org.elasticsearch.search.aggregations.metrics.max does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[39,57] package org.elasticsearch.search.aggregations.metrics.min does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[40,57] package org.elasticsearch.search.aggregations.metrics.sum does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[41,64] package org.elasticsearch.search.aggregations.metrics.valuecount does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/AggregateQueryPageSource.java:[33,59] package org.elasticsearch.search.aggregations.metrics.stats does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/AggregateQueryPageSource.java:[299,45] cannot find symbol
@mosabua now the tests are running without any failures can you help me to get the changes at the earliest.
Can you please squash the commits and ensure the build actually passes. Currently CI is still failing.
@mosabua Thank you for the information. I have squashed the commits and the pipeline has passed all checks.
Can you change the commit message to
Add support for TopN and aggregation pushdown
@murthy-chelankuri
I will see if @wendigo @ebyhr or others are available for review help.
Can you change the commit message to
Add support for TopN and aggregation pushdown
@murthy-chelankuri
I will see if @wendigo @ebyhr or others are available for review help.
Changed the commit message as per your suggestion
@mosabua can you help to prioritize the code review for this PR before we lose traction?
Looking forward to this feature!
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.