
[Question] Several questions about the HugeGraph distributed cluster

Open jokerCoCo opened this issue 3 years ago • 3 comments

Problem Type

No response

Before submit

  • [X] I have confirmed that no identical / duplicate question exists in the current Issues / FAQ

Environment

  • Server Version: v0.12.x
  • Backend: RocksDB 3 nodes, HDD
  • OS: 1 CPU, 32 GB RAM, Ubuntu 20.04
  • Data Size: 10 million vertices, 40 million edges

Your Question

1. When I bulk-import data with hugegraph-loader, the resulting RocksDB storage is unevenly distributed across the backend: the master's disk usage is about 20 GB, while a worker node holds only a bit over 2 GB. I used the default configuration throughout; do I need to set any extra parameters to balance the load?

2. My understanding is that HugeGraph's distributed support currently lies mainly in distributed storage; I couldn't find any distributed computing capability. Won't that become a bottleneck in OLAP scenarios? Are there plans to implement distributed computing later?
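To quantify the imbalance described above, a small sketch like the following can be run on each node to total the on-disk size of the RocksDB data directory (the path is a placeholder standing in for this cluster's `rocksdb.data_path`, not a value from the issue):

```python
import os

def dir_size(path):
    """Total size in bytes of all regular files under `path`, recursively.

    Run on each node against its rocksdb.data_path and compare the results
    to measure how unevenly data is distributed. A missing path yields 0.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):
                total += os.path.getsize(fp)
    return total

if __name__ == "__main__":
    # Placeholder path; substitute the actual rocksdb.data_path per node.
    print(dir_size("/path/to/disk1"))
```

Comparing these totals across the three nodes gives a more precise picture than `df`, since it excludes unrelated files on the same disk.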

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

jokerCoCo avatar Sep 21 '22 02:09 jokerCoCo

The current default release should already run in multi-replica mode, so there shouldn't be an uneven distribution. Could you add (edit in) more configuration details and context?

On the second point, distributed computing is a core feature under active development. Multiple API requests can already be dispatched automatically across different nodes; distributing the work within a single request requires fairly large changes and testing.

imbajin avatar Sep 21 '22 13:09 imbajin

OK. Below is the relevant information from my configuration files. The cluster is set up as 1 master and 2 workers, with RocksDB as the backend storage.

rest-server.properties configuration:

host1 (master):

rpc.server_host=host1
rpc.remote_url=host1,host2,host3
server.id=server-1
server.role=master

host2 (worker):

rpc.server_host=host2
rpc.remote_url=host1,host2,host3
server.id=server-2
server.role=worker

host3 (worker):

rpc.server_host=host3
rpc.remote_url=host1,host2,host3
server.id=server-3
server.role=worker

I learned the configuration above from other issues; the host values are actually my servers' IP addresses.
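The rpc settings above follow a simple invariant: every node's rpc.server_host must appear in the shared rpc.remote_url list, with unique server.id values and exactly one master. A minimal sketch (not HugeGraph code; the dict layout is just an illustration, with the values copied from the configs above) to sanity-check that:

```python
# Values transcribed from the three rest-server.properties blocks above
# ("hostN" stands for the real IP addresses, as noted in the issue).
nodes = [
    {"rpc.server_host": "host1", "rpc.remote_url": "host1,host2,host3",
     "server.id": "server-1", "server.role": "master"},
    {"rpc.server_host": "host2", "rpc.remote_url": "host1,host2,host3",
     "server.id": "server-2", "server.role": "worker"},
    {"rpc.server_host": "host3", "rpc.remote_url": "host1,host2,host3",
     "server.id": "server-3", "server.role": "worker"},
]

def check_rpc_config(nodes):
    # rpc.remote_url must be identical on every node.
    urls = {n["rpc.remote_url"] for n in nodes}
    assert len(urls) == 1, "rpc.remote_url must be identical on every node"
    peers = urls.pop().split(",")
    # Each node's own host must be one of the listed peers.
    for n in nodes:
        assert n["rpc.server_host"] in peers, \
            n["server.id"] + ": rpc.server_host not in rpc.remote_url"
    # server.id must be unique, and exactly one node is the master.
    ids = [n["server.id"] for n in nodes]
    assert len(ids) == len(set(ids)), "server.id must be unique"
    masters = [n for n in nodes if n["server.role"] == "master"]
    assert len(masters) == 1, "exactly one node should have server.role=master"

check_rpc_config(nodes)  # the configs in this issue satisfy all three rules
```

By these checks the rest-server.properties shown here look internally consistent, so the imbalance is unlikely to come from this file.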

hugegraph.properties configuration. For the raft parameters I mainly changed raft.mode, raft.safe_read, raft.endpoint, and raft.group_peers; everything else is left at the defaults.

Master configuration:

# gremlin entrance to create graph
# auth config: com.baidu.hugegraph.auth.HugeFactoryAuthProxy
gremlin.graph=com.baidu.hugegraph.HugeFactory

# cache config
#schema.cache_capacity=100000
# vertex-cache default is 1000w, 10min expired
vertex.cache_type=l2
#vertex.cache_capacity=10000000
#vertex.cache_expire=600
# edge-cache default is 100w, 10min expired
edge.cache_type=l2
#edge.cache_capacity=1000000
#edge.cache_expire=600


# schema illegal name template
#schema.illegal_name_regex=\s+|~.*

#vertex.default_label=vertex

backend=rocksdb
serializer=binary

store=hugegraph1

raft.mode=true
raft.safe_read=true
raft.use_snapshot=false
raft.endpoint=192.168.1.16:8282
raft.group_peers=192.168.1.16:8282,192.168.1.17:8282,192.168.1.18:8282
raft.path=./raft-log
raft.use_replicator_pipeline=true
raft.election_timeout=10000
raft.snapshot_interval=3600
raft.backend_threads=48
raft.read_index_threads=8
raft.read_strategy=ReadOnlyLeaseBased
raft.queue_size=16384
raft.queue_publish_timeout=60
raft.apply_batch=1
raft.rpc_threads=80
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60000

search.text_analyzer=jieba
search.text_analyzer_mode=INDEX

# rocksdb backend config
rocksdb.data_path=/path/to/disk1
rocksdb.wal_path=/path/to/disk1


# cassandra backend config
cassandra.host=localhost
cassandra.port=9042
cassandra.username=
cassandra.password=
#cassandra.connect_timeout=5
#cassandra.read_timeout=20
#cassandra.keyspace.strategy=SimpleStrategy
#cassandra.keyspace.replication=3

# hbase backend config
#hbase.hosts=localhost
#hbase.port=2181
#hbase.znode_parent=/hbase
#hbase.threads_max=64

# mysql backend config
#jdbc.driver=com.mysql.jdbc.Driver
#jdbc.url=jdbc:mysql://127.0.0.1:3306
#jdbc.username=root
#jdbc.password=
#jdbc.reconnect_max_times=3
#jdbc.reconnect_interval=3
#jdbc.sslmode=false

# postgresql & cockroachdb backend config
#jdbc.driver=org.postgresql.Driver
#jdbc.url=jdbc:postgresql://localhost:5432/
#jdbc.username=postgres
#jdbc.password=
#jdbc.postgresql.connect_database=template1

# palo backend config
#palo.host=127.0.0.1
#palo.poll_interval=10
#palo.temp_dir=./palo-data
#palo.file_limit_size=32

worker1 configuration:

raft.mode=true
raft.safe_read=true
raft.use_snapshot=false
raft.endpoint=192.168.1.17:8282
raft.group_peers=192.168.1.16:8282,192.168.1.17:8282,192.168.1.18:8282
raft.path=./raft-log
raft.use_replicator_pipeline=true
raft.election_timeout=10000
raft.snapshot_interval=3600
raft.backend_threads=48
raft.read_index_threads=8
raft.read_strategy=ReadOnlyLeaseBased
raft.queue_size=16384
raft.queue_publish_timeout=60
raft.apply_batch=1
raft.rpc_threads=80
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60000

worker2 configuration:

raft.mode=true
raft.safe_read=true
raft.use_snapshot=false
raft.endpoint=192.168.1.18:8282
raft.group_peers=192.168.1.16:8282,192.168.1.17:8282,192.168.1.18:8282
raft.path=./raft-log
raft.use_replicator_pipeline=true
raft.election_timeout=10000
raft.snapshot_interval=3600
raft.backend_threads=48
raft.read_index_threads=8
raft.read_strategy=ReadOnlyLeaseBased
raft.queue_size=16384
raft.queue_publish_timeout=60
raft.apply_batch=1
raft.rpc_threads=80
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60000
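The three raft sections above are identical except for raft.endpoint. A small sketch (not HugeGraph code; the parser handles only the simple key=value lines used here) to verify the invariants those configs rely on, namely identical raft.group_peers on every node and each node's raft.endpoint contained in that peer list:

```python
def parse_properties(text):
    """Parse simple Java-style key=value properties, skipping comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def check_raft(configs):
    # raft.group_peers must be byte-identical across nodes.
    peer_lists = {c["raft.group_peers"] for c in configs}
    assert len(peer_lists) == 1, "raft.group_peers must match on all nodes"
    peers = peer_lists.pop().split(",")
    # Every node needs its own distinct endpoint, listed among the peers.
    endpoints = [c["raft.endpoint"] for c in configs]
    assert len(set(endpoints)) == len(configs), "raft.endpoint must be unique"
    for ep in endpoints:
        assert ep in peers, ep + " missing from raft.group_peers"

# Endpoint/peer values transcribed from the three configs in this issue:
master = """\
raft.endpoint=192.168.1.16:8282
raft.group_peers=192.168.1.16:8282,192.168.1.17:8282,192.168.1.18:8282
"""
worker1 = """\
raft.endpoint=192.168.1.17:8282
raft.group_peers=192.168.1.16:8282,192.168.1.17:8282,192.168.1.18:8282
"""
worker2 = """\
raft.endpoint=192.168.1.18:8282
raft.group_peers=192.168.1.16:8282,192.168.1.17:8282,192.168.1.18:8282
"""

check_raft([parse_properties(t) for t in (master, worker1, worker2)])
```

By these checks the raft settings posted here also look internally consistent, which supports looking elsewhere (e.g. replication state or data placement) for the disk-usage gap.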

jokerCoCo avatar Sep 23 '22 01:09 jokerCoCo

Due to the lack of activity, the current issue is marked as stale and will be closed after 20 days, any update will remove the stale label

github-actions[bot] avatar Oct 08 '22 21:10 github-actions[bot]

@jokerCoCo You can refer to https://github.com/apache/incubator-hugegraph/issues/1979#issuecomment-1325085377 ; for distributed computing, see the hugegraph-computer repository.

javeme avatar Nov 23 '22 13:11 javeme