[Summary] integrate new modules PD and Store components into hugegraph (Breaking Change)

Open VGalaxies opened this issue 1 year ago • 1 comment

Feature Description

  • mailing list link: https://lists.apache.org/thread/w92mrt8b8n93yvxfqy9hvy17p355hscg
  • preview release: https://github.com/hugegraph/hugegraph/releases/tag/pd-store-tmp
  • deployment guide: https://github.com/apache/incubator-hugegraph/wiki/HugeGraph-Distributed-(pd%E2%80%90store)-Version-Deployment-Guide
  • project tracking: https://github.com/orgs/apache/projects/316
  • sub-summary issues:
    • https://github.com/apache/incubator-hugegraph/issues/2481
    • https://github.com/apache/incubator-hugegraph/issues/2482
    • https://github.com/apache/incubator-hugegraph/issues/2483
    • https://github.com/apache/incubator-hugegraph/issues/2499
    • https://github.com/apache/incubator-hugegraph/issues/2512

Related issues: #435 #828 #1218 #1398 #1517 #1581 #1609 #1760 #1925 #1968 #1979

Background

The community version of HugeGraph still uses the version 1.0 architecture. For the internal version 2.0 of HugeGraph, we set the following design goals:

  • Support trillion-scale data storage and tens of thousands of graph storages.
  • Support multi-active, high availability, dynamic scalability, and automated operation and maintenance.
  • Maximize read and write performance.

Based on these goals, we have designed a distributed architecture that supports graph data partitioning and multiple replicas, and separates storage from computation for flexible scaling.

In version 2.0, in addition to hugegraph-server, we introduce two new components: hugegraph-pd and hugegraph-store.


The responsibilities of these two components are as follows:

  • hugegraph-pd: PD stands for placement driver. It can be thought of as a meta server responsible for service discovery, partition information storage, and node scheduling.
  • hugegraph-store: a new built-in storage backend that uses RocksDB as the foundation of its distributed storage.
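
To make this division of labour concrete, here is a toy Python sketch of how a PD-like meta server could answer "which store nodes hold this key?". Everything in it is an illustrative assumption: the class `ToyPlacementDriver`, the node names, and the CRC32-modulo partitioning scheme are hypothetical and are not HugeGraph's actual API or partitioning algorithm.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ToyPlacementDriver:
    """Hypothetical stand-in for a meta server: it stores placement metadata only."""

    partition_count: int
    # partition id -> names of the store nodes that hold a replica (assumed layout)
    replicas: dict[int, list[str]] = field(default_factory=dict)

    def register(self, partition_id: int, nodes: list[str]) -> None:
        """Record which store nodes hold the replicas of a partition."""
        if not 0 <= partition_id < self.partition_count:
            raise ValueError(f"unknown partition {partition_id}")
        self.replicas[partition_id] = list(nodes)

    def partition_of(self, key: bytes) -> int:
        # CRC32 modulo the partition count: a deterministic toy scheme,
        # assumed here purely for illustration.
        return zlib.crc32(key) % self.partition_count

    def locate(self, key: bytes) -> list[str]:
        """Return the store nodes a client would contact for this key."""
        return self.replicas[self.partition_of(key)]


# Four partitions, three replicas each, spread over three hypothetical nodes.
driver = ToyPlacementDriver(partition_count=4)
for pid in range(4):
    driver.register(pid, [f"store-{(pid + i) % 3}" for i in range(3)])

print(driver.locate(b"vertex:42"))
```

In this sketch the client asks the placement driver only for metadata and then talks to the store nodes directly, which is the sense in which storage and placement are separate concerns.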

We will gradually merge the internal version 2.0 of HugeGraph into the community version. Therefore, the first step is to integrate the hugegraph-pd and hugegraph-store modules into this repository.

Tasks

Introduce PD and Store on the pd-store branch.

Currently, all tasks should be carried out on the pd-store branch:

  • Adjust the project structure of this repository to include three sub-modules at the root level: hugegraph-server, hugegraph-pd, and hugegraph-store.
    • [x] https://github.com/apache/incubator-hugegraph/pull/2266
  • Merge the internal version of pd into hugegraph.
    • [x] https://github.com/apache/incubator-hugegraph/pull/2270
    • [x] https://github.com/apache/incubator-hugegraph/pull/2273
  • Merge the internal version of store into hugegraph.
    • [x] https://github.com/apache/incubator-hugegraph/pull/2272
  • Adjust the POM structure so that the project can be built directly from the root.
    • [x] https://github.com/apache/incubator-hugegraph/pull/2275
    • [x] https://github.com/apache/incubator-hugegraph/pull/2289
  • Configuration adjustments (including merging dependencies, unifying version numbers, Maven compilation adjustments, etc.).
    • [x] https://github.com/apache/incubator-hugegraph/pull/2288
    • [x] https://github.com/apache/incubator-hugegraph/pull/2305
    • [x] https://github.com/apache/incubator-hugegraph/pull/2321
  • Adapt the code (hugegraph-server needs new modules to connect to hugegraph-pd and hugegraph-store, and other modules need corresponding adjustments).
    • [x] https://github.com/apache/incubator-hugegraph/pull/2301
    • [x] https://github.com/apache/incubator-hugegraph/pull/2319
  • Synchronization from the master branch.
    • [x] https://github.com/apache/incubator-hugegraph/pull/2345
    • [x] https://github.com/apache/incubator-hugegraph/pull/2414

Project structure adjustments on the master branch.

After https://github.com/apache/incubator-hugegraph/pull/2301, it should be possible to perform simple CRUD operations on HugeGraph on the pd-store branch. In the next phase, the pd-store branch will need to be merged into the master branch at an appropriate granularity.

  • (master) project structure adjustments and CI configurations
    • [x] https://github.com/apache/incubator-hugegraph/pull/2338
    • [x] https://github.com/apache/incubator-hugegraph/pull/2382

We will reorganize the commit messages and merge the changes by which the pd-store branch is ahead of master into the master branch, in the following groups:

  • [x] pd-grpc, pd-common, pd-client (~8k loc)
  • [x] pd-core (~9k loc)
  • [x] pd-service (~11k loc)
  • [x] pd-dist, pd-test (~1k loc)

  • [x] store-grpc, store-common, store-client (~13k loc)
  • [x] store-rocksdb (~4k loc)
  • [x] store-core (~14k loc)
  • [x] store-node (~11k loc)
  • [x] store-dist, store-test, store-cli (~3k loc)

  • [x] server-hstore (~3k loc)
  • [x] server-core (~12k loc)

Module dependencies (Mermaid flowchart):

flowchart TD
  A(hg-pd-grpc)
  B(hg-pd-common)
  C(hg-pd-client)
  D(hugegraph-core)
  E(hugegraph-hstore)
  F(hg-store-grpc)
  G(hg-store-common)
  H(hg-store-client)
  J(hg-pd-core)
  K(hg-pd-service)
  L(hg-pd-dist)
  M(hg-store-cli)
  N(hg-store-rocksdb)
  O(hg-store-core)
  P(hg-store-node)
  Q(hg-store-dist)
  B-->A
  C-->B
  D-.->C
  E-->D
  H-->F
  H-->G
  H-->C
  D-->G
  E-->H
  O-->D
  J-->B
  K-->J
  L-->K
  M-->H
  N-->G
  O-.->H
  O-->N
  P-->O
  Q-->P
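
If the arrows in the flowchart above are read as "depends on" (an assumption, since the issue does not label them), a topological sort gives one order in which every module comes after the modules it depends on. The following is a minimal Python sketch using the standard-library `graphlib` module (Python 3.9+); the edge list is transcribed from the flowchart, and solid and dotted arrows are treated alike.

```python
from graphlib import TopologicalSorter

# Edges transcribed from the flowchart: "X --> Y" and "X -.-> Y" are both
# read as "X depends on Y" (an assumption; the issue does not label them).
DEPENDS_ON = {
    "hg-pd-common": {"hg-pd-grpc"},
    "hg-pd-client": {"hg-pd-common"},
    "hugegraph-core": {"hg-pd-client", "hg-store-common"},
    "hugegraph-hstore": {"hugegraph-core", "hg-store-client"},
    "hg-store-client": {"hg-store-grpc", "hg-store-common", "hg-pd-client"},
    "hg-pd-core": {"hg-pd-common"},
    "hg-pd-service": {"hg-pd-core"},
    "hg-pd-dist": {"hg-pd-service"},
    "hg-store-cli": {"hg-store-client"},
    "hg-store-rocksdb": {"hg-store-common"},
    "hg-store-core": {"hugegraph-core", "hg-store-client", "hg-store-rocksdb"},
    "hg-store-node": {"hg-store-core"},
    "hg-store-dist": {"hg-store-node"},
}


def build_order(depends_on):
    """Return the modules so that each one follows all of its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = build_order(DEPENDS_ON)
for position, module in enumerate(order, start=1):
    print(f"{position:2d}. {module}")
```

The result is only one of several valid orders; `TopologicalSorter` raises `CycleError` if the edge list contains a cycle, which makes the sketch usable as a quick sanity check on the graph.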

Project Structure

# new project structure
                   - core/api/test/...
          - server - hbase/rocksdb/mysql/...
          -        - ...
hugegraph - pd - pd submodules
          -
          - store - store submodules

VGalaxies, Aug 03 '23 03:08

please add store related sub-task to this issue, thx~

Pengzna, Mar 16 '24 12:03