trino icon indicating copy to clipboard operation
trino copied to clipboard

Elastic search plugin support for pushdown TopN and Aggregation

Open murthy-chelankuri opened this issue 1 year ago • 7 comments

Description

Additional context and related issues

This PR provides the support pushing down TopN and Aggregation function to elasticsearch.

( ) This is not user-visible or docs only and no release notes are required. ( x) Release notes are required, please propose a release note for me. ( ) Release notes are required, with the following suggested text:


# Fixes
  • #12381
  • #7026
  • #4803

murthy-chelankuri avatar Apr 06 '23 18:04 murthy-chelankuri

Following for updates. Elastic is a widely used data store and it has very poor sql support.

cploonker avatar Jul 10 '23 20:07 cploonker

@hashhar @kokosing , Can you please review this merge request

murthy-chelankuri avatar Sep 25 '23 20:09 murthy-chelankuri

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Jan 16 '24 17:01 github-actions[bot]

:wave: @murthy-chelankuri @martint @bitsondatadev - this PR has become inactive. If you're still interested in working on it, please let us know.

Also fyi .. thanks to the work of @wendigo we now also have an OpenSearch connector .. so the changes could be ported there as well.

We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.

mosabua avatar Jan 16 '24 19:01 mosabua

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Feb 09 '24 17:02 github-actions[bot]

@mosabua sorry for the delayed response. Yes I would like these changes merged. Please let me know about the next steps on this.

murthy-chelankuri avatar Feb 10 '24 20:02 murthy-chelankuri

Can you check out the test failures and ensure they at least do not occur on your own setup.

mosabua avatar Feb 13 '24 00:02 mosabua

@mosabua I made a few changes, and now the tests are successful without any issues. However, the pipeline is failing even for the build, although both the build and tests worked well on my local machine.

murthy-chelankuri avatar Feb 19 '24 14:02 murthy-chelankuri

@murthy-chelankuri the code doesn't compile. Have you rebased it on master? I guess there are some logical merge conflicts

wendigo avatar Feb 19 '24 14:02 wendigo

@wendigo Yes, I rebased with master. It's surprising that everything is working fine locally. Could there be an issue with the pipeline?

murthy-chelankuri avatar Feb 19 '24 14:02 murthy-chelankuri

@murthy-chelankuri definitely not. I guess that your local setup is invalid rather then our pipelines which runs several hundred times a week.

wendigo avatar Feb 19 '24 14:02 wendigo

This branch is not rebased:

commit e1dc146f2409447318c47a0ebf1167fa4c315942
Author: Martin Traverso <[email protected]>
Date:   Mon Sep 25 09:21:59 2023 -0700

    Avoid exponential DFA in LIKE matcher

    When a pattern contains a % and a sequence of _ somewhere
    after it, it may produce a DFA with exponential number of states.
    In such cases, fall back to the NFA matcher.

it's based on commit from 6 months ago. In the meantime, elasticsearch client was updated.

wendigo avatar Feb 19 '24 14:02 wendigo

After rebasing I'm getting the same compilation errors as in the CI:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.12.1:compile (default-compile) on project trino-elasticsearch: Compilation failure: Compilation failure:
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[37,57] package org.elasticsearch.search.aggregations.metrics.avg does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[38,57] package org.elasticsearch.search.aggregations.metrics.max does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[39,57] package org.elasticsearch.search.aggregations.metrics.min does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[40,57] package org.elasticsearch.search.aggregations.metrics.sum does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchQueryBuilder.java:[41,64] package org.elasticsearch.search.aggregations.metrics.valuecount does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/AggregateQueryPageSource.java:[33,59] package org.elasticsearch.search.aggregations.metrics.stats does not exist
[ERROR] /Users/mateuszgajewski/Projects/src/github.com/trinodb/trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/AggregateQueryPageSource.java:[299,45] cannot find symbol

wendigo avatar Feb 19 '24 14:02 wendigo

@mosabua now the tests are running without any failures can you help me to get the changes at the earliest.

murthy-chelankuri avatar Feb 20 '24 05:02 murthy-chelankuri

Can you please squash the commits and ensure the build actually passes. Currently CI is still failing.

mosabua avatar Feb 21 '24 02:02 mosabua

@mosabua Thank you for the information. I have squashed the commits and the pipeline has passed all checks.

murthy-chelankuri avatar Feb 21 '24 06:02 murthy-chelankuri

Can you change the commit message to

Add support for TopN and aggregation pushdown

@murthy-chelankuri

I will see if @wendigo @ebyhr or others are available for review help.

mosabua avatar Feb 21 '24 22:02 mosabua

Can you change the commit message to

Add support for TopN and aggregation pushdown

@murthy-chelankuri

I will see if @wendigo @ebyhr or others are available for review help.

Changed the commit message as per your suggestion

murthy-chelankuri avatar Feb 22 '24 06:02 murthy-chelankuri

@mosabua can you help to prioritize the code review for this PR before we lose traction?

murthy-chelankuri avatar Feb 29 '24 23:02 murthy-chelankuri

Looking forward to this feature!

jbackofe avatar Mar 06 '24 20:03 jbackofe

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Mar 28 '24 17:03 github-actions[bot]

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar May 28 '24 17:05 github-actions[bot]

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

github-actions[bot] avatar Jun 18 '24 17:06 github-actions[bot]