Support AWS OpenSearch / Elasticsearch

Open nelson-lark opened this issue 3 years ago • 10 comments

Is your feature request related to a problem? Please describe.

As of a couple of months ago, the Elasticsearch organization has made the official Python elasticsearch plugin incompatible with Amazon-supported OpenSearch. If you fire up Superset using the current helm chart and attempt to connect to a recently deployed AWS "Elasticsearch" (which is now an Apache 2.0 licensed OpenSearch), you will receive this ambiguous error from Superset:

superset 2021-09-16 17:34:09,348:WARNING:superset.views.base:[SupersetError(message='(builtins.NoneType) None\n(Background on this error at: http://sqlalche.me/e/13/dbapi)', error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'ElasticSearch (SQL API)', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]

If I run Python directly on the Superset pod and connect to the AWS "OpenSearch/Elasticsearch" cluster, I get a more detailed error from the elasticsearch Python module directly:

elasticsearch.exceptions.UnsupportedProductError: The client noticed that the server is not a supported distribution of Elasticsearch

It would be nice if Superset could support OpenSearch natively.

Describe the solution you'd like

  • At the least, highlight in the docs that the elasticsearch plugin needs downgrading to be compatible with AWS Elasticsearch/OpenSearch clusters for the time being.
  • Find a way to support OpenSearch natively. Quite annoyingly, I think AWS has only published a Java library for OpenSearch. I believe an official Python library that SQLAlchemy could use is needed.

Describe alternatives you've considered

I am currently using the following bootstrapScript, which downgrades the elasticsearch Python plugin to the last known compatible version (before Elasticsearch became hostile to AWS, rightfully or not).

bootstrapScript: |
  #!/bin/bash
  rm -rf /var/lib/apt/lists/* && \
  pip uninstall -y elasticsearch && \
  pip install \
    elasticsearch==7.13.4 \
    elasticsearch-dbapi==0.2.5 \
    elasticsearch-dbapi[opendistro] \
    psycopg2==2.8.5 \
    redis==3.2.1 && 
  if [ ! -f ~/bootstrap ]; then echo "Running Superset with uid {{ .Values.runAsUser }}" > ~/bootstrap; fi

Additional context

Ultimately, the fix may need to be made in SQLAlchemy, since that is the middleman between Superset and the databases, but I thought it was worth requesting here in case anyone else is trying to connect to an Amazon "Elasticsearch"/OpenSearch cluster and is having difficulty.

nelson-lark avatar Sep 17 '21 01:09 nelson-lark

  • At the least, highlight in the docs that the elasticsearch plugin needs downgrading to be compatible with AWS Elasticsearch/OpenSearch clusters for the time being.

@nelson-lark thanks for the note! I'm not familiar with this area, looping @craig-rueda in. 🙏 Please do feel free to open a PR to update the docs accordingly! 🙏

junlincc avatar Sep 17 '21 18:09 junlincc

:wave: I'm on the OpenSearch team.

Here is some background on the client issue that appears to be causing this. I'm only vaguely familiar with Superset, but I believe the issue originates here:

https://github.com/preset-io/elasticsearch-dbapi/blob/master/setup.py#L31

Effectively, newer Elasticsearch clients will only work with non-OSS Elasticsearch moving forward (this break also affects Elasticsearch versions prior to 7.12 IIRC). The break occurred at ES client v7.14, so by requiring elasticsearch>7, new installs pull the latest client, which includes the blocking code.

OpenSearch clients (Python client here) will work with both OpenSearch and ES and are syntax compatible. Not sure how Superset as a project wants to handle this, but I'm glad to help personally and get some other folks from our side to contribute.
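
To make the version boundary concrete, here is a minimal sketch (not part of any library) that checks whether an `elasticsearch` client version string is older than 7.14, the release where, per the description above, the break occurred. The names `parse_version`, `FIRST_BLOCKING_VERSION`, and `predates_product_check` are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Turn '7.13.4' into (7, 13, 4); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumption taken from this thread: the blocking check first shipped in client v7.14.
FIRST_BLOCKING_VERSION = (7, 14)


def predates_product_check(version: str) -> bool:
    """Return True if this client version is older than the first blocking release."""
    return parse_version(version) < FIRST_BLOCKING_VERSION
```

One way to use it would be to pass in the installed version, for example the string returned by `importlib.metadata.version("elasticsearch")`, and warn when the result is `False`.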

stockholmux avatar Sep 23 '21 13:09 stockholmux

Hi @stockholmux, I'm a maintainer/creator of elasticsearch-dbapi. I'll look into including opensearch-py in elasticsearch-dbapi.

Thank you!

dpgaspar avatar Sep 23 '21 13:09 dpgaspar

Related issue on the driver repo: https://github.com/preset-io/elasticsearch-dbapi/issues/70

dpgaspar avatar Sep 27 '21 09:09 dpgaspar

Can we close this @dpgaspar ? 🤔

srinify avatar Oct 05 '21 19:10 srinify

It's not 100% solved. The latest elasticsearch-dbapi release will not throw any errors because I've pinned the Elastic elasticsearch client version so that it doesn't include the change that broke AWS OpenSearch compatibility. This is a quick fix; the final solution would be to include the new opensearch-py in elasticsearch-dbapi.

dpgaspar avatar Oct 05 '21 20:10 dpgaspar

@stockholmux I'm having some trouble using elasticsearch-dbapi==0.2.6. Here is the detailed description of the issue: https://github.com/apache/superset/issues/17347

Any help is greatly appreciated. Thanks in advance!

harshgadhia avatar Nov 05 '21 00:11 harshgadhia

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

stale[bot] avatar Apr 17 '22 22:04 stale[bot]

Hello everyone! I'm currently working on a project that needs to connect Superset and OpenSearch.

I forked the elasticsearch-dbapi project and swapped the opendistro dependencies for opensearch ones. It worked well: I was able to connect and create the database and datasets. Everything was perfect until the SQL aggregations for the charts.

When Superset generates a SQL statement with an aggregation, it adds an alias that uses the same name as the aggregation function, and OpenSearch doesn't recognize this alias.

SELECT count(codigo) AS count(codigo) FROM cnes_estabelecimento_saude LIMIT 50000;

So, I'm trying to figure out how to avoid creating the alias, or how to avoid using the aggregation function in the alias.

@stockholmux please, could you check if that makes sense?

Edit:

I figured it out: there is a sanitize function in the API that removes the quotes (https://github.com/preset-io/elasticsearch-dbapi/blob/master/es/opendistro/api.py#L291), so I just removed that line and put the quotes in the last one to prevent the dummy schema from being set. Voilà! Worked like a charm.

@dpgaspar I will clean up the file, removing the leftovers from my trial and error, and create an MR to contribute to the main project as soon as possible. Thank you so much.
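
A simplified illustration of why stripping the quotes breaks the statement. This is a hypothetical stand-in, not the actual code in `es/opendistro/api.py`, and the double-quoted alias form is an assumption about what Superset emits:

```python
def strip_double_quotes(sql: str) -> str:
    """Hypothetical stand-in for a sanitize step that removes double quotes."""
    return sql.replace('"', "")


# Assumed input: the alias arrives quoted, so it is a valid identifier.
quoted = 'SELECT count(codigo) AS "count(codigo)" FROM cnes_estabelecimento_saude LIMIT 50000'

# After stripping, the alias looks like a second function call, matching the failing query above.
stripped = strip_double_quotes(quoted)
```

Keeping the quotes around the alias is what lets the aggregation query go through in this reading of the problem.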

Unanimad avatar May 18 '22 10:05 Unanimad

@Unanimad Do you still plan to create an MR for the elasticsearch-dbapi project?

gzcf avatar Sep 14 '22 11:09 gzcf

update bootstrapScript:

pip uninstall -y elasticsearch && \
pip install \
            elasticsearch==7.10.1 \
            elasticsearch-dbapi==0.2.4

This way you can connect to OpenSearch 2.5, using a URL like:

odelasticsearch+https://my_user:[email protected]:443/
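
If the password contains characters such as `@` or `/`, the URL above needs them percent-encoded. A small sketch of building it, assuming the same `odelasticsearch+https` scheme; the function name, host, and credentials below are made-up placeholders:

```python
from urllib.parse import quote


def build_opensearch_uri(user: str, password: str, host: str, port: int = 443) -> str:
    """Build an odelasticsearch+https URI, percent-encoding the credentials."""
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"odelasticsearch+https://{safe_user}:{safe_password}@{host}:{port}/"


# Placeholder values for illustration only.
uri = build_opensearch_uri("my_user", "p@ss/word", "search.example.com")
```

The resulting string is what would be pasted into the SQLAlchemy URI field when adding the database in Superset.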

izerui avatar Mar 30 '23 04:03 izerui

Hi everybody, I'm scouting alternatives for building internal BI tools. For us, OpenSearch is one of the main data sources. Is there any update on this ticket?

ungarida avatar Jul 17 '23 12:07 ungarida

It would also help us if this was formally resolved into separate elasticsearch and opensearch support.

ftrotter avatar Sep 28 '23 17:09 ftrotter

Closing this for a few reasons:

  1. It's a feature request rather than a bug report, so it should be a GitHub Discussion or anyone can just open a PR
  2. It sounds like it's at least partly solved by now(?)
  3. It's conflating two feature requests/DBs, which should be tracked separately
  4. It's been silent/stale for upward of 6 months.

Happy to continue the discussion and revisit/reopen as necessary.

rusackas avatar Mar 19 '24 18:03 rusackas