
[Question] Detailed tutorial for distributed deployment of HugeGraph [Resolved / awaiting feedback]

Open ZQYKing opened this issue 6 months ago • 23 comments

Problem Type (问题类型)

configs (配置项 / 文档相关)

Before submit

  • [x] 我已经确认现有的 Issues / FAQ 中没有相同 / 重复问题 (I have confirmed and searched that there are no similar problems in the historical issues and documents)

Environment (环境信息)

  • Server Version: 1.0.0 (Apache Release Version)
  • Backend: RocksDB x nodes, HDD or SSD
  • OS: xx CPUs, xx G RAM, Ubuntu 2x.x / CentOS 7.x
  • Data Size: xx vertices, xx edges

Your Question (问题描述)

Using G1GC as the default garbage collector
2025-06-07 18:08:34 [main] [INFO] o.a.h.s.RestServer - RestServer starting...
2025-06-07 18:08:38 [main] [INFO] o.a.h.u.ConfigUtil - Scanning option 'graphs' directory './conf/graphs'
2025-06-07 18:08:38 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.endpoint' is redundant, please ensure it has been registered
2025-06-07 18:08:38 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.election_timeout_ms' is redundant, please ensure it has been registered
2025-06-07 18:08:38 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.rpc_connect_timeout_ms' is redundant, please ensure it has been registered
2025-06-07 18:08:38 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.rpc_timeout_ms' is redundant, please ensure it has been registered
2025-06-07 18:08:38 [main] [INFO] o.a.h.b.s.BackendProviderFactory - Opening backend store 'rocksdb' in raft mode for graph 'hugegraph'
2025-06-07 18:08:38 [main] [INFO] o.a.h.b.s.r.RaftBackendStoreProvider - Init raft backend schema store
2025-06-07 18:08:38 [db-open-1] [INFO] o.a.h.b.s.r.RocksDBStore - Opening RocksDB with data path: ./data/m
2025-06-07 18:08:39 [main] [INFO] o.a.h.b.c.CacheManager - Init RamCache for 'schema-id-hugegraph' with capacity 10000
2025-06-07 18:08:39 [main] [INFO] o.a.h.b.c.CacheManager - Init RamCache for 'schema-name-hugegraph' with capacity 10000
2025-06-07 18:08:39 [main] [INFO] o.a.h.b.s.r.RaftBackendStoreProvider - Init raft backend system store
2025-06-07 18:08:39 [db-open-1] [INFO] o.a.h.b.s.r.RocksDBStore - Opening RocksDB with data path: ./data/s
2025-06-07 18:08:41 [main] [INFO] o.a.h.b.s.r.RaftBackendStoreProvider - Init raft backend graph store
2025-06-07 18:08:41 [db-open-1] [INFO] o.a.h.b.s.r.RocksDBStore - Opening RocksDB with data path: ./data/g
2025-06-07 18:08:41 [main] [INFO] o.c.o.l.Uns - OHC using JNA OS native malloc/free
2025-06-07 18:08:41 [main] [INFO] o.a.h.b.c.CacheManager - Init LevelCache for 'vertex-hugegraph' with capacity 10000:10000000
2025-06-07 18:08:41 [main] [INFO] o.a.h.b.c.CacheManager - Init LevelCache for 'edge-hugegraph' with capacity 1000:1000000
2025-06-07 18:08:41 [main] [INFO] o.a.h.t.TaskManager - Use local as the scheduler of graph (hugegraph)
2025-06-07 18:08:41 [main] [INFO] o.a.h.b.c.CacheManager - Init RamCache for 'users-hugegraph' with capacity 10240
2025-06-07 18:08:41 [main] [INFO] o.a.h.b.c.CacheManager - Init RamCache for 'users_pwd-hugegraph' with capacity 10240
2025-06-07 18:08:41 [main] [INFO] o.a.h.b.c.CacheManager - Init RamCache for 'token-hugegraph' with capacity 10240
2025-06-07 18:08:41 [main] [INFO] o.a.h.u.ConfigUtil - Scanning option 'graphs' directory './conf/graphs'
2025-06-07 18:08:41 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.endpoint' is redundant, please ensure it has been registered
2025-06-07 18:08:41 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.election_timeout_ms' is redundant, please ensure it has been registered
2025-06-07 18:08:41 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.rpc_connect_timeout_ms' is redundant, please ensure it has been registered
2025-06-07 18:08:41 [main] [WARN] o.a.h.c.HugeConfig - The config option 'raft.rpc_timeout_ms' is redundant, please ensure it has been registered
2025-06-07 18:08:41 [main] [INFO] o.a.h.c.GraphManager - Graph 'hugegraph' was successfully configured via './conf/graphs/hugegraph.properties'
2025-06-07 18:08:41 [main] [WARN] o.a.h.c.GraphManager - You may need to support access control for './conf/graphs/hugegraph.properties' with gremlin.graph=org.apache.hugegraph.auth.HugeFactoryAuthProxy
2025-06-07 18:08:41 [main] [WARN] c.a.s.r.b.DefaultProviderBootstrap - Method with same name "org.apache.hugegraph.auth.AuthManager.validateUser" exists ! The usage of overloading method in rpc is deprecated.
2025-06-07 18:08:42 [main] [INFO] o.a.h.r.RpcServer - RpcServer started success on port 8091
2025-06-07 18:08:42 [main] [INFO] o.a.h.b.s.r.RaftContext - Start raft server successfully: 192.168.230.128:8091
2025-06-07 18:08:42 [main] [WARN] c.a.s.j.s.s.l.LocalSnapshotStorage - No data for snapshot reader ./raftlog/snapshot.
2025-06-07 18:08:42 [main] [WARN] c.a.s.j.RaftGroupService - RPC server is not started in RaftGroupService.
2025-06-07 18:08:42 [main] [INFO] o.a.h.b.s.r.RaftNode - Start raft node: [hugegraph-192.168.230.128:8091]
2025-06-07 18:08:42 [main] [INFO] o.a.h.b.s.r.RaftBackendStoreProvider - The raft node is initialized
2025-06-07 18:08:42 [main] [INFO] o.a.h.b.s.r.RaftNode - Waiting for raft group 'hugegraph' leader elected
6月 07, 2025 6:08:44 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.
6月 07, 2025 6:08:46 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.
6月 07, 2025 6:08:48 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.
6月 07, 2025 6:08:50 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.
6月 07, 2025 6:08:52 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.
6月 07, 2025 6:08:54 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.
6月 07, 2025 6:08:56 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.
6月 07, 2025 6:08:58 下午 org.glassfish.jersey.server.ServerRuntime$Responder process
严重: Error occurred when processing a response created from an already mapped exception.

Vertex/Edge example (问题点 / 边数据举例)


Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)


ZQYKing avatar Jun 07 '25 10:06 ZQYKing

[root@localhost conf]# cat rest-server.properties

# bind url
# could use '0.0.0.0' or specified (real)IP to expose external network access
restserver.url=http://0.0.0.0:8080
#restserver.enable_graphspaces_filter=false

# gremlin server url, need to be consistent with host and port in gremlin-server.yaml
#gremlinserver.url=http://127.0.0.1:8182

graphs=./conf/graphs

# The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0
batch.max_write_ratio=80
batch.max_write_threads=0

# configuration of arthas
arthas.telnet_port=8562
arthas.http_port=8561
arthas.ip=0.0.0.0
arthas.disabled_commands=jad

# authentication configs
# choose 'org.apache.hugegraph.auth.StandardAuthenticator' or
# 'org.apache.hugegraph.auth.ConfigAuthenticator'
auth.authenticator=org.apache.hugegraph.auth.StandardAuthenticator

# for StandardAuthenticator mode
auth.graph_store=hugegraph

# auth client config
#auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897

# for ConfigAuthenticator mode
#auth.admin_token=
#auth.user_tokens=[]

# TODO: Deprecated & removed later (useless from version 1.5.0)
# rpc server configs for multi graph-servers or raft-servers
rpc.server_host=192.168.230.128
rpc.server_port=8091
rpc.server_timeout=30

# rpc client configs (like enable to keep cache consistency)
rpc.remote_url=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091
rpc.client_connect_timeout=60
rpc.client_reconnect_period=15
rpc.client_read_timeout=60
rpc.client_retries=5
rpc.client_load_balancer=consistentHash
#rpc.client_connect_timeout=20
#rpc.client_reconnect_period=10
#rpc.client_read_timeout=40
#rpc.client_retries=3
#rpc.client_load_balancer=consistentHash

# raft group initial peers
raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091

# lightweight load balancing (beta)
server.id=server1
server.role=master

# slow query log
log.slow_query_threshold=1000

# jvm(in-heap) memory usage monitor, set 1 to disable it
memory_monitor.threshold=0.85
memory_monitor.period=2000

ZQYKing avatar Jun 07 '25 10:06 ZQYKing

[root@localhost conf]# cat graphs/hugegraph.properties

# gremlin entrance to create graph
# auth config: org.apache.hugegraph.auth.HugeFactoryAuthProxy
gremlin.graph=org.apache.hugegraph.HugeFactory

# cache config
#schema.cache_capacity=100000

# vertex-cache default is 1000w, 10min expired
vertex.cache_type=l2
#vertex.cache_capacity=10000000
#vertex.cache_expire=600

# edge-cache default is 100w, 10min expired
edge.cache_type=l2
#edge.cache_capacity=1000000
#edge.cache_expire=600

# schema illegal name template
#schema.illegal_name_regex=\s+|~.*

#vertex.default_label=vertex

backend=rocksdb
serializer=binary

store=hugegraph

# pd config
#pd.peers=127.0.0.1:8686

# task config
task.scheduler_type=local
task.schedule_period=10
task.retry=0
task.wait_timeout=10

# search config
search.text_analyzer=jieba
search.text_analyzer_mode=INDEX

# rocksdb backend config
rocksdb.data_path=./data
rocksdb.wal_path=./wal

# cassandra backend config
cassandra.host=localhost
cassandra.port=9042
cassandra.username=
cassandra.password=
#cassandra.connect_timeout=5
#cassandra.read_timeout=20
#cassandra.keyspace.strategy=SimpleStrategy
#cassandra.keyspace.replication=3

# hbase backend config
#hbase.hosts=localhost
#hbase.port=2181
#hbase.znode_parent=/hbase
#hbase.threads_max=64
# IMPORTANT: recommend to modify the HBase partition number
#            by the actual/env data amount & RS amount before init store
#            It will influence the load speed a lot
#hbase.enable_partition=true
#hbase.vertex_partitions=10
#hbase.edge_partitions=30

# mysql backend config
#jdbc.driver=com.mysql.jdbc.Driver
#jdbc.url=jdbc:mysql://127.0.0.1:3306
#jdbc.username=root
#jdbc.password=
#jdbc.reconnect_max_times=3
#jdbc.reconnect_interval=3
#jdbc.ssl_mode=false

# postgresql & cockroachdb backend config
#jdbc.driver=org.postgresql.Driver
#jdbc.url=jdbc:postgresql://localhost:5432/
#jdbc.username=postgres
#jdbc.password=
#jdbc.postgresql.connect_database=template1

# palo backend config
#palo.host=127.0.0.1
#palo.poll_interval=10
#palo.temp_dir=./palo-data
#palo.file_limit_size=32

# WARNING: These raft configurations are deprecated, please use the latest version instead.
raft.mode=true
raft.safe_read=true
raft.endpoint=192.168.230.128:8091
raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091
raft.election_timeout_ms=60000
raft.rpc_connect_timeout_ms=60000
raft.rpc_timeout_ms=60000

ZQYKing avatar Jun 07 '25 10:06 ZQYKing

Hi @wanwancloud! I'm Dosu and I’m helping the incubator-hugegraph team.

The raft-related options that appear in your config and logs (such as raft.endpoint, raft.election_timeout_ms, etc.) have been deprecated. HugeGraph now only recognizes options registered in the OptionSpace; unregistered ones are simply warned about as "redundant" (reference). It is recommended to keep only the raft options explicitly supported by the official docs and source code, such as raft.mode, raft.safe_read, raft.group_peers, raft.election_timeout, raft.rpc_connect_timeout and raft.rpc_timeout, and to make the option names and units match the source code (for example raft.election_timeout is in milliseconds, default 10000).
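
For illustration, a sketch of that replacement in graphs/hugegraph.properties, using only option names and defaults mentioned in this thread (the peer list is the one from your config; treat the concrete values as examples):

# deprecated options, now only warned about as "redundant":
#raft.endpoint=192.168.230.128:8091
#raft.election_timeout_ms=60000
#raft.rpc_connect_timeout_ms=60000
#raft.rpc_timeout_ms=60000
# supported options (election/connect timeouts in ms, rpc timeout in s):
raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091
raft.election_timeout=10000
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60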

Common issues in distributed RocksDB deployments include data corruption caused by OOM, "SST file ahead of WALs" errors, and unreleased lock files. Suggestions:

  1. Increase the JVM memory (-Xmx/-Xms),
  2. Tune rocksdb.write_buffer_size, rocksdb.max_write_buffer_number and rocksdb.block_cache_size,
  3. Use rocksdb.data_disks in multi-disk environments,
  4. Compact regularly,
  5. If SST/WAL files are inconsistent, repair them with the ldb repair tool,
  6. For lock issues, make sure all processes have been shut down before re-initializing, and delete the lock file manually if necessary (mind data safety; see the sketch after this list) (reference)
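
For item 6, a cautious sketch of checking for leftover locks (the ./data/* layout follows the rocksdb.data_path=./data paths shown in your log; treat the paths as assumptions and back up the data first):

ps -ef | grep [h]ugegraph   # confirm no hugegraph-server process is still running
ls ./data/*/LOCK            # RocksDB keeps a LOCK file inside each store directory
# rm ./data/*/LOCK          # remove only after confirming nothing is running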

On the netty-all compatibility issue with the Cassandra backend: HugeGraph 1.5.0 is recommended with Cassandra 3.1x; 4.x has breaking changes, so downgrading to a 3.1x version is advised (reference). From 1.7 onwards Cassandra will no longer be a primary backend; for production the official recommendation is HStore (Raft + RocksDB), and HBase/Cassandra will gradually be phased out (reference).

If you need distributed high availability, it is recommended to use HStore (Raft + RocksDB) directly: configure raft.group_peers and raft.mode=true and avoid the deprecated options. For raft-related exceptions, make sure the raft configuration is identical on all nodes, the paths are writable, the ports do not conflict, and the dependency versions match (reference).

If you need to further pinpoint specific RocksDB or Cassandra errors, please attach the detailed error logs.


dosubot[bot] avatar Jun 07 '25 10:06 dosubot[bot]

For HStore (Raft + RocksDB) with raft.group_peers and raft.mode=true, could you please provide the detailed parameters?

ZQYKing avatar Jun 07 '25 10:06 ZQYKing

For a distributed HStore (Raft + RocksDB) deployment, the core configuration options are as follows:

  1. raft.group_peers: required. It lists the RPC addresses of all Raft nodes, e.g. raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091, and must be identical on all nodes. (reference)

  2. raft.mode: deprecated, do not configure it anymore; whether raft is actually enabled is determined by whether raft.group_peers is present. (reference)

  3. Recommended raft-related options (all go into graphs/hugegraph.properties, units noted in the comments):

# Raft election timeout (milliseconds, default 10000)
raft.election_timeout=10000
# Raft RPC connect timeout (milliseconds, default 5000)
raft.rpc_connect_timeout=5000
# Raft RPC timeout (seconds, default 60)
raft.rpc_timeout=60
# whether to enable pipelined (parallel) log replication (default true)
raft.use_replicator_pipeline=true
# number of backend threads (default: number of CPU cores)
raft.backend_threads=8
# number of read-index threads (default 8)
raft.read_index_threads=8
  4. RocksDB suggestions: set rocksdb.write_buffer_size, rocksdb.max_write_buffer_number and rocksdb.block_cache_size reasonably, use rocksdb.data_disks for multiple disks, compact regularly, and increase the JVM heap. (reference)

  5. Port allocation: the REST, Gremlin and RPC ports of each node must not conflict, e.g. 8081/8181/8091, 8082/8182/8092, 8083/8183/8093. (reference)

  6. backend/serializer: for distributed deployment, backend=hstore and serializer=binary are recommended. (reference)

  7. Initialization: once all nodes are configured consistently, run init-store.sh on each node to initialize, then start the services.

Note: options such as raft.endpoint, raft.election_timeout_ms, raft.rpc_connect_timeout_ms and raft.rpc_timeout_ms are deprecated and must be replaced with the new options above. (reference)

If you need a detailed example or run into a specific error, please attach the logs.


dosubot[bot] avatar Jun 07 '25 11:06 dosubot[bot]

Does the option raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091 go into rest-server.properties or into graphs/hugegraph.properties? I have been deploying for two weeks and could not get a cluster up with Docker, docker-compose or the binary package. It is really hard; could you please give me some guidance? Thanks.

ZQYKing avatar Jun 07 '25 11:06 ZQYKing

The raft.group_peers option should be written in each node's graphs/hugegraph.properties (or hugegraph.properties), not in rest-server.properties. raft.group_peers must be identical on all nodes, in the format you wrote. rest-server.properties mainly configures the REST service, RPC service, etc. and is not responsible for Raft cluster membership discovery.

If you are using the new HStore (Raft + RocksDB) cluster, the configuration has gradually moved to a separate YAML file per node (such as application-store0.yml): each node identifies itself with raft.address and all nodes discover each other via pdserver.address. In that case the global raft.group_peers option is no longer needed; cluster membership is discovered and managed by PD.

If you are using the traditional distributed hugegraph-server (non-HStore), just put raft.group_peers in graphs/hugegraph.properties. Other raft-related options (such as raft.election_timeout, raft.rpc_connect_timeout, raft.rpc_timeout) should also go there; do not use deprecated options like raft.endpoint or raft.election_timeout_ms.

Deployment suggestions: keep the configuration identical on all nodes, avoid port conflicts, run init-store.sh on every node during initialization, start in the order PD / Store / Server (if using HStore), and attach the specific logs if you hit errors. (reference 1) (reference 2)

If you use docker/docker-compose, note that environment variables need the hugegraph. prefix, and when mounting config files make sure each node's configuration is independent and consistent. (reference)
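
As a rough docker-compose sketch of "one mounted config directory per node" (the image tag, in-container config path and memory values here are assumptions; check the documentation of the image you actually use):

services:
  hugegraph-node1:
    image: hugegraph/hugegraph:1.5.0            # assumed tag
    ports:
      - "8081:8080"                             # REST port of this node
    volumes:
      - ./node1/conf:/opt/hugegraph-server/conf # assumed in-container conf path
    environment:
      - JAVA_OPTS=-Xms8g -Xmx8g                 # heap size is only an example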

If there are still specific errors, feel free to paste the detailed logs.


dosubot[bot] avatar Jun 07 '25 11:06 dosubot[bot]

My rest-server.properties currently contains: rpc.server_host=192.168.230.128 rpc.server_port=8091 rpc.remote_url=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091

And graphs/hugegraph.properties contains: rocksdb.data_path=./data rocksdb.wal_path=./wal backend=rocksdb serializer=binary store=hugegraph raft.mode=true raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091

Is this workable? Will it form a cluster?

ZQYKing avatar Jun 07 '25 11:06 ZQYKing

Your configuration needs a few adjustments:

  1. raft.group_peers should only be written in graphs/hugegraph.properties, not in rest-server.properties. rest-server.properties is only responsible for the REST/RPC services, not for Raft cluster discovery. (reference)

  2. raft.mode is deprecated and should be removed. As long as raft.group_peers is configured, Raft mode is enabled automatically. (reference)

  3. raft.endpoint, raft.election_timeout_ms, raft.rpc_connect_timeout_ms and raft.rpc_timeout_ms are also deprecated; replace them with raft.election_timeout, raft.rpc_connect_timeout and raft.rpc_timeout, whose units are milliseconds and seconds respectively. (reference)

  4. Your rocksdb.data_path, rocksdb.wal_path, backend, serializer, store and raft.group_peers are all required and look fine. graphs/hugegraph.properties must be identical on all nodes, and the ports must not conflict.

  5. It is recommended to upgrade to HugeGraph 1.5.0 or later: distributed deployment is more stable and the raft/hstore configuration is more standardized; Java 11 is required. (reference)

  6. RocksDB performance suggestions: set rocksdb.write_buffer_size, rocksdb.max_write_buffer_number and rocksdb.block_cache_size reasonably, use rocksdb.data_disks for multiple disks, increase the JVM heap, and compact regularly. (reference)

With the adjustments above you should in principle be able to form a Raft cluster, as sketched below. If there are still exceptions, please paste the specific error logs.
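
A minimal sketch of graphs/hugegraph.properties after those adjustments, using the peer list from your setup (keep it identical on all three nodes; the timeout values are the defaults quoted earlier in this thread):

backend=rocksdb
serializer=binary
store=hugegraph
rocksdb.data_path=./data
rocksdb.wal_path=./wal
raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091
raft.election_timeout=10000
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60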


dosubot[bot] avatar Jun 07 '25 11:06 dosubot[bot]

For tens of billions of vertices and edges, how should the JVM heap be sized? In which configuration file is it set and what should the parameters look like? I am currently using RocksDB storage with HugeGraph 1.5.0.

ZQYKing avatar Jun 07 '25 11:06 ZQYKing

For tens of billions of vertices and edges, a JVM heap of at least 16 GB is a reasonable starting point; 32 GB or more is recommended in production, depending on physical memory and workload. The JVM size is not set in a configuration file but via startup arguments (such as -Xmx32g -Xms32g) or the JAVA_OPTS environment variable.

If you start from the binary package, either modify the startup script (such as bin/start-hugegraph.sh) directly or export JAVA_OPTS="-Xms32g -Xmx32g ..." before starting, as sketched below.
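
A minimal shell sketch of that second option (whether the startup script picks up JAVA_OPTS depends on your HugeGraph version; if it does not, edit the -Xms/-Xmx values inside bin/start-hugegraph.sh instead):

export JAVA_OPTS="-Xms32g -Xmx32g -XX:+UseG1GC"
bin/start-hugegraph.sh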

If you use Docker or docker-compose, add the environment variable in the compose file or the docker run command, for example:

environment:
  - JAVA_OPTS=-Xms32g -Xmx32g -XX:+UseG1GC

docker run -e JAVA_OPTS="-Xms32g -Xmx32g -XX:+UseG1GC" ...

The official HugeGraph Dockerfile defaults to -XX:MaxRAMPercentage=50; you can override JAVA_OPTS to let the JVM use more memory. (reference)

Note: the JVM heap should generally not exceed half of the physical memory, to avoid competing with RocksDB's native memory. RocksDB options such as rocksdb.write_buffer_size, rocksdb.max_write_buffer_number and rocksdb.block_cache_size should also be increased reasonably, for example:

rocksdb.write_buffer_size=128MB
rocksdb.max_write_buffer_number=6
rocksdb.block_cache_size=8GB

These go into graphs/hugegraph.properties.

If you run into OOM or performance bottlenecks, adjust dynamically based on the monitoring logs and the actual workload. (reference)


dosubot[bot] avatar Jun 07 '25 11:06 dosubot[bot]

rocksdb.write_buffer_size=128MB rocksdb.max_write_buffer_number=6 rocksdb.block_cache_size=8GB

After setting these I get an error:

2025-06-07 19:37:19 [main] [ERROR] o.a.h.c.TypedOption - Invalid type of value '128MB' for option 'rocksdb.write_buffer_size'
java.lang.reflect.InvocationTargetException: null
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at org.apache.hugegraph.config.TypedOption.parse(TypedOption.java:149) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.config.TypedOption.parse(TypedOption.java:122) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.config.TypedOption.parseConvert(TypedOption.java:115) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.config.HugeConfig.validateOption(HugeConfig.java:137) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.config.HugeConfig.addPropertyDirect(HugeConfig.java:117) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.config.HugeConfig.addPropertyInternal(HugeConfig.java:130) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.commons.configuration2.AbstractConfiguration.addProperty(AbstractConfiguration.java:574) ~[commons-configuration2-2.8.0.jar:2.8.0]
at org.apache.commons.configuration2.AbstractConfiguration.append(AbstractConfiguration.java:1331) ~[commons-configuration2-2.8.0.jar:2.8.0]
at org.apache.hugegraph.config.HugeConfig.loadConfig(HugeConfig.java:75) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.config.HugeConfig.(HugeConfig.java:54) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.auth.StandardAuthenticator.setup(StandardAuthenticator.java:125) ~[hugegraph-api-1.5.0.jar:1.5.0]
at org.apache.hugegraph.auth.HugeAuthenticator.loadAuthenticator(HugeAuthenticator.java:143) ~[hugegraph-api-1.5.0.jar:1.5.0]
at org.apache.hugegraph.core.GraphManager.(GraphManager.java:100) ~[hugegraph-api-1.5.0.jar:0.71.0.0]

ZQYKing avatar Jun 07 '25 11:06 ZQYKing

I am using RocksDB as the storage backend for a distributed deployment; could you provide a detailed deployment tutorial?

ZQYKing avatar Jun 09 '25 02:06 ZQYKing

The general procedure for a distributed HugeGraph deployment with RocksDB (Raft mode) is as follows:

  1. Environment preparation: install JDK 11+ on every machine, 16 GB+ RAM recommended, SSD preferred. Download the HugeGraph 1.5.0 binary package or the official Docker image (use a release tag).

  2. Configuration file (graphs/hugegraph.properties must be identical on every node):

backend=rocksdb
serializer=binary
store=hugegraph
rocksdb.data_path=./data
rocksdb.wal_path=./wal
raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091
# recommended raft-related options
raft.election_timeout=10000
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60
# RocksDB performance options (values are in bytes: do not write 128MB, write 134217728)
rocksdb.write_buffer_size=134217728
rocksdb.max_write_buffer_number=6
rocksdb.block_cache_size=8589934592

Note that raft.mode, raft.endpoint, raft.election_timeout_ms and similar options are deprecated and must no longer be used.

  3. rest-server.properties only needs the REST/RPC ports, the graphs path, etc.; do not put raft.group_peers there.

  4. Startup order: run bin/init-store.sh once on every node to initialize (make sure the data directory has no lock file and is not used by another process), then start the service with bin/start-hugegraph.sh. In Docker, set the JVM heap via the JAVA_OPTS environment variable (e.g. -Xmx32g -Xms32g); with docker-compose, mount each node's own config files and prefix environment variables with hugegraph.. A per-node sketch follows this list.

  5. Port allocation: the REST, Gremlin and RPC ports of each node must not conflict, e.g. 8081/8181/8091, 8082/8182/8092, 8083/8183/8093.

  6. Common issues: options such as rocksdb.write_buffer_size and rocksdb.block_cache_size must be given in bytes (e.g. 134217728), not as 128MB, otherwise a type error is reported. For lock errors, make sure no process is holding the lock, and delete the lock file manually if necessary (with no service running). (reference)

  7. Cluster health: all nodes must have exactly the same configuration, raft.group_peers must contain all nodes, the nodes must be reachable from each other, and the data/wal paths must be writable.

For details see the official configuration templates and distributed deployment notes. If you hit a specific error, paste the logs for further diagnosis.
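
A minimal per-node sketch of the init-then-start sequence above, assuming the binary package layout (the install directory name is hypothetical):

cd /opt/hugegraph-server      # hypothetical install directory
bin/init-store.sh             # run once per node; the data directory must not be locked
bin/start-hugegraph.sh        # then start the REST server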


dosubot[bot] avatar Jun 09 '25 02:06 dosubot[bot]

After starting the cluster with RocksDB as the backend, data is not synchronized. The configuration is as follows. rest-server.properties contains: raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091

graphs/hugegraph.properties contains: backend=rocksdb serializer=binary store=hugegraph rocksdb.data_path=./data rocksdb.wal_path=./wal raft.group_peers=192.168.230.128:8091,192.168.230.129:8091,192.168.230.130:8091 raft.election_timeout=10000 raft.rpc_connect_timeout=5000 raft.rpc_timeout=60 rocksdb.write_buffer_size=134217728 rocksdb.max_write_buffer_number=6 rocksdb.block_cache_size=8589934592

Then I ran bin/init-store.sh on all nodes to initialize and started the cluster with start-hugegraph. Vertices and edges created on 192.168.230.128:8080 cannot be queried on 192.168.230.129:8080 or 192.168.230.130:8080. How should I handle the data not being synchronized?

ZQYKing avatar Jun 09 '25 03:06 ZQYKing

Could you please answer the question above?

ZQYKing avatar Jun 09 '25 03:06 ZQYKing

For a distributed deployment on 192.168.1.10, 192.168.1.11 and 192.168.1.12 using RocksDB as the storage backend, could you please provide a detailed deployment tutorial and the configuration file for each server?

ZQYKing avatar Jun 09 '25 03:06 ZQYKing

For a distributed HugeGraph deployment (RocksDB backend) on 192.168.1.10, 192.168.1.11 and 192.168.1.12, the recommended procedure and configuration are as follows:


1. Environment preparation
Install JDK 11+ on each server, 16 GB+ RAM recommended, SSD preferred. Download the HugeGraph 1.5.0 release package.


2. Configuration files (every server needs them; the content is almost identical, only the ports and the data/wal paths may differ)

Taking the 3 machines as an example, assume the REST/Gremlin/RPC ports are:

  • 192.168.1.10:8081/8181/8091
  • 192.168.1.11:8082/8182/8092
  • 192.168.1.12:8083/8183/8093

conf/rest-server.properties (change the ports on each machine; keep the graphs path the same):

restserver.url=http://0.0.0.0:808X   # X=1/2/3
graphs=./conf/graphs
rpc.server_host=192.168.1.1X         # X=0/1/2
rpc.server_port=809X                 # X=1/2/3
rpc.remote_url=192.168.1.10:8091,192.168.1.11:8092,192.168.1.12:8093
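
A concrete instance of that pattern for 192.168.1.10 (the other two nodes change only the restserver port, rpc.server_host and rpc.server_port accordingly):

restserver.url=http://0.0.0.0:8081
graphs=./conf/graphs
rpc.server_host=192.168.1.10
rpc.server_port=8091
rpc.remote_url=192.168.1.10:8091,192.168.1.11:8092,192.168.1.12:8093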

conf/graphs/hugegraph.properties (identical content on all nodes; the data/wal paths may differ):

backend=rocksdb
serializer=binary
store=hugegraph
rocksdb.data_path=./data
rocksdb.wal_path=./wal
raft.group_peers=192.168.1.10:8091,192.168.1.11:8092,192.168.1.12:8093
raft.election_timeout=10000
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60
rocksdb.write_buffer_size=134217728
rocksdb.max_write_buffer_number=6
rocksdb.block_cache_size=8589934592

Do not use raft.mode, raft.endpoint, raft.election_timeout_ms or similar options anymore.


3. Startup procedure

  • Run bin/init-store.sh once on every node to initialize (make sure the data directory has no lock file and is not used by another process).
  • Start the service: bin/start-hugegraph.sh
  • In Docker, set the JVM heap via JAVA_OPTS (e.g. -Xmx32g -Xms32g); with docker-compose, mount each node's own config files and prefix environment variables with hugegraph..

4. Common issues and troubleshooting

  • Options such as rocksdb.write_buffer_size and rocksdb.block_cache_size must be given in bytes (e.g. 134217728), not as 128MB, otherwise a type error is reported.
  • Common causes of data not syncing: inconsistent configuration, port conflicts, not all nodes listed in raft.group_peers, conflicting data/wal paths, no network connectivity between nodes, or nodes that were not all initialized.
  • For lock errors, make sure no process is holding the lock, and delete the lock file manually if necessary (with no service running). (reference)
  • When deploying multiple instances, each instance's data_path/wal_path must be unique; sharing the same directory causes lock conflicts (see the sketch after this list). (reference)
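
For instance, two server instances sharing one machine would each need their own paths in their own graphs/hugegraph.properties (the directories here are hypothetical):

# instance 1
rocksdb.data_path=/data/hugegraph1/data
rocksdb.wal_path=/data/hugegraph1/wal
# instance 2
rocksdb.data_path=/data/hugegraph2/data
rocksdb.wal_path=/data/hugegraph2/wal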

5. Important notes

  • graphs/hugegraph.properties must be identical on all nodes, raft.group_peers must contain all nodes, the nodes must be reachable from each other, and the data/wal paths must be writable.
  • Data synchronization relies on the Raft protocol: data is written on the leader node and then replicated to the followers, so wait for the Raft log replication to complete.
  • If you hit RocksDB errors such as "SST file ahead of WALs", back up the data and try repairing it with the ldb repair tool. (reference)

If you have a specific error or special requirement, attach the logs for further diagnosis.


dosubot[bot] avatar Jun 09 '25 03:06 dosubot[bot]

How do I verify the cluster? The cluster has a username and password configured; how do I get the cluster status via API requests?

ZQYKing avatar Jun 09 '25 04:06 ZQYKing

To verify the HugeGraph cluster status, you can use the REST API to query the Raft group members and leader, as well as the system and backend health metrics. Since you have configured a username and password, the API requests must carry HTTP Basic Auth credentials.

The commonly used endpoints are:

  1. List all Raft peers:
GET http://<host>:<port>/graphs/<graph>/raft/list_peers
  2. Query the current leader node:
GET http://<host>:<port>/graphs/<graph>/raft/get_leader
  3. Query system health metrics:
GET http://<host>:<port>/metrics/system
  4. Query backend storage health metrics:
GET http://<host>:<port>/metrics/backend

Authentication: add the following request header:

Authorization: Basic <base64-encoded username:password>

For example, with username admin and password 123456 the base64 encoding is YWRtaW46MTIzNDU2, so the complete header is:

Authorization: Basic YWRtaW46MTIzNDU2
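
You can generate that value with the standard base64 tool, for example:

echo -n 'admin:123456' | base64   # prints YWRtaW46MTIzNDU2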

You can test with curl (listing the peers, for example):

curl -X GET "http://192.168.1.10:8081/graphs/hugegraph/raft/list_peers" \
  -H "Authorization: Basic YWRtaW46MTIzNDU2"

The response lists all Raft nodes; in a healthy cluster every node appears in the list and the leader information can be retrieved.
To get the system/backend metrics, just change the URL to /metrics/system or /metrics/backend.
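
As a quick sanity check across all three nodes (a sketch; the hosts, ports, graph name and credentials come from the examples above, so adjust them to your deployment), curl's -u flag produces the same Basic Auth header:

for node in 192.168.1.10:8081 192.168.1.11:8082 192.168.1.12:8083; do
  # every node should report the same leader
  curl -s -u admin:123456 "http://$node/graphs/hugegraph/raft/get_leader"
  echo
done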

These endpoints require admin privileges; ordinary users cannot access them. (reference: RaftAPI source, MetricsAPI source)

If you get an unexpected error, paste the returned content for further troubleshooting.


dosubot[bot] avatar Jun 09 '25 04:06 dosubot[bot]

Feedback: the cluster was deployed successfully, thank you.

ZQYKing avatar Jun 09 '25 06:06 ZQYKing

> Feedback: the cluster was deployed successfully, thank you.

@wanwancloud Sorry about this, we will also sync and update the docs on the official website for this part soon @JackyYangPassion

refer:

  • https://github.com/apache/incubator-hugegraph/issues/2265
  • https://github.com/apache/incubator-hugegraph/wiki/HugeGraph-Distributed-(pd%E2%80%90store)-Version-Deployment-Guide (deployment configuration reference)

You are also welcome to improve the related docs at any time.

imbajin avatar Jun 09 '25 07:06 imbajin

Due to the lack of activity, the current issue is marked as stale and will be closed after 20 days, any update will remove the stale label

github-actions[bot] avatar Jun 24 '25 21:06 github-actions[bot]