[FEATURE] OpenSearch and EMR-Serverless Integration

Open penghuo opened this issue 2 years ago • 0 comments

Phase - 1: Spark Connector and Flint API Support [Done]

  • OpenSearch Release: 2.9.0

Phase - 2: Support EMR-Serverless as compute engine.

Goals

  • OpenSearch Release: 2.11.0
    • Users can create a GlueS3 datasource.
    • Users can configure EMR-S as the compute engine.
    • Add new async query API.

Tasks

  • [x] Add GlueS3 datasource
  • [ ] EMR-S Interface
    • [x] Add EMR-S configuration
    • [x] Add EMR-S and Spark configuration when submitting a job to EMR-S
  • [x] Add Async query API
    • [x] Add Create Job API
      • [x] Add AuthZ check that the user has permission to access the datasource
    • [x] Add Fetch job result API
    • [ ] Add Transport Action to Security Plugin
  • [x] Query Engine
    • [x] Support parsing datasource from SparkSQL and PPL query
    • [x] Support different DDL / DQL / streaming queries.
  • [x] Query Result Index
    • [x] Query Result Index Specification
    • [x] Query Result Index Template
    • [x] Query Result Index Life cycle management
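
The Create Job / Fetch job result pair in the tasks above can be pictured as a small in-memory model. This is only a sketch under stated assumptions: the class name `AsyncQueryService`, the `PENDING` / `SUCCESS` states, the permission map, and the `complete_job` helper are all hypothetical and are not taken from the plugin's code or REST API.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class AsyncQueryService:
    """Hypothetical in-memory stand-in for the create-job / fetch-result APIs."""

    # Assumed AuthZ model: datasource name -> set of users allowed to query it.
    permissions: dict
    # Stands in for the query result index: job id -> job record.
    jobs: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def create_job(self, user, datasource, query):
        """Check datasource access, register the job, and return its id."""
        if user not in self.permissions.get(datasource, set()):
            raise PermissionError(f"{user} may not access {datasource}")
        job_id = f"job-{next(self._ids)}"
        self.jobs[job_id] = {
            "state": "PENDING",
            "datasource": datasource,
            "query": query,
            "rows": None,
        }
        return job_id

    def complete_job(self, job_id, rows):
        """Simulate the compute engine writing rows for a finished job."""
        self.jobs[job_id].update(state="SUCCESS", rows=rows)

    def fetch_result(self, job_id):
        """Return only the status until the job succeeds, then include rows."""
        job = self.jobs[job_id]
        if job["state"] != "SUCCESS":
            return {"status": job["state"]}
        return {"status": "SUCCESS", "rows": job["rows"]}
```

A caller would submit with `create_job(...)`, keep the returned id, and poll `fetch_result(...)` until the status changes. The point of the split is that submission returns immediately while the query runs on the external compute engine.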

Phase - 3: Add Session based query execution engine

  • [x] #2271
  • [ ] https://github.com/opensearch-project/sql/issues/2332
  • [x] https://github.com/opensearch-project/sql/pull/2312
  • [x] https://github.com/opensearch-project/sql/pull/2290
    • [x] Add allocateSession
  • [x] https://github.com/opensearch-project/sql/pull/2290
    • [x] Add Session State Machine
    • [x] Add Session State store
  • [ ] Add statement
    • [x] https://github.com/opensearch-project/sql/pull/2294
    • [x] https://github.com/opensearch-project/opensearch-spark/issues/79
    • [x] https://github.com/opensearch-project/opensearch-spark/issues/80
    • [x] https://github.com/opensearch-project/sql/issues/2333
  • [x] https://github.com/opensearch-project/sql/pull/2327
    • [x] Integrate with query execution
    • [x] Integrate with fetch query result
    • [x] Integrate with cancel query
    • [x] Add session management index for each datasource
  • [ ] Test
    • [ ] Integ test with SparkApp
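
The allocateSession, Session State Machine, and Session State store items above could fit together roughly as follows. This is an illustrative sketch, not the plugin's implementation: the state names, the transition table, and the `SessionStateStore` class are assumptions made for the example.

```python
from enum import Enum


class SessionState(Enum):
    # Hypothetical state names chosen for illustration.
    NOT_STARTED = "not_started"
    RUNNING = "running"
    DEAD = "dead"
    FAIL = "fail"


# Assumed transition table: DEAD and FAIL are treated as terminal states.
ALLOWED = {
    SessionState.NOT_STARTED: {SessionState.RUNNING, SessionState.FAIL},
    SessionState.RUNNING: {SessionState.DEAD, SessionState.FAIL},
    SessionState.DEAD: set(),
    SessionState.FAIL: set(),
}


class SessionStateStore:
    """In-memory stand-in for a per-datasource session management index."""

    def __init__(self):
        self._states = {}

    def allocate_session(self, session_id):
        """Register a new session in its initial state."""
        if session_id in self._states:
            raise ValueError(f"session {session_id} already exists")
        self._states[session_id] = SessionState.NOT_STARTED
        return session_id

    def transition(self, session_id, new_state):
        """Move a session to new_state, rejecting transitions not in ALLOWED."""
        current = self._states[session_id]
        if new_state not in ALLOWED[current]:
            raise ValueError(f"illegal transition {current.name} -> {new_state.name}")
        self._states[session_id] = new_state

    def get(self, session_id):
        return self._states[session_id]
```

Keeping the transition table separate from the store makes illegal moves (for example, restarting a dead session) fail loudly rather than silently overwriting state.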

Phase - 4: Add batch session and streaming session

  • [ ] Deployment
    • [ ] https://github.com/opensearch-project/sql/issues/2334
  • [ ] Fault tolerance
    • [ ] https://github.com/opensearch-project/sql/issues/2330
    • [x] #2344
    • [ ] #2345
  • [ ] Operation
    • [x] https://github.com/opensearch-project/sql/issues/2331
  • [ ] Add Batch Session
  • [ ] Add Streaming Session

penghuo, Sep 12 '23 23:09