Local Node will stop working after a few hours uptime
Hello,
I'm looking for a way to run the hiero local node continuously without losing the data.
It seems that even when following the guidelines on RAM and storage, the system will normally fall over after 2 hours, sometimes sooner.
I have had the system running for 24 hours, but the data became out of sync, so the block explorer and API didn't contain the recent transactions.
Any help appreciated. I've tried things like moving the working directory and running both with and without --full.
Hey @madebymatty, config flags should make little difference to the persistent operation of the local node. Changing the directory should not have an effect either. Can you provide more information about your hardware platform and setup? Also, please check the docker logs for the mirror node importer component for errors. The problem you're describing points towards an issue with it.
Sure,
Hardware is AWS
- m6i.2xlarge
- 64GB storage
- 8 vCPUs
- ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-20250115
Build using hedera start -d --full
docker logs mirror-node-importer
A bunch of these errors appear after the node explorer stops getting updated:
2025-02-22T05:22:46.121Z ERROR scheduling-3 c.h.m.i.d.r.RecordFileDownloader None of the data files could be verified, signatures: [StreamFileSignature(filename=2025-02-22T05_01_41.132867268Z.rcd_sig, node=0, signatureType=SHA_384_WITH_RSA, status=CONSENSUS_REACHED, streamType=RECORD, version=6)]
2025-02-22T05:22:46.641Z ERROR parallel-7 c.h.m.i.d.r.RecordFileDownloader Error downloading signature files for node 0
com.hedera.mirror.importer.exception.SignatureFileParsingException: Error reading signature file
    at com.hedera.mirror.importer.reader.signature.CompositeSignatureFileReader.read(CompositeSignatureFileReader.java:57)
    at com.hedera.mirror.importer.downloader.Downloader.lambda$downloadAndParseSigFiles$3(Downloader.java:230)
    at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:208)
    at reactor.core.publisher.FluxLimitRequest$FluxLimitRequestSubscriber.onNext(FluxLimitRequest.java:99)
    at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99)...
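For anyone triaging a long importer log like the one above, here is a minimal Python sketch that counts entries per level and logger, so the noisiest component stands out. It assumes every entry starts with the "timestamp LEVEL thread logger" header seen in the pasted output. The summarize function and the sample text are illustrative only (the sample abbreviates the signature list to "[...]"); this is not part of the local node tooling.

```python
import re
from collections import Counter

# Header of an importer log entry as seen in the pasted output:
# "<ISO timestamp>Z <LEVEL> <thread> <logger> ..."
# Assumption: every entry starts with this header.
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z) "
    r"(ERROR|WARN|INFO|DEBUG|TRACE) (\S+) (\S+)"
)


def summarize(log_text):
    """Count log entries per (level, logger) pair.

    Uses finditer instead of splitting on newlines, so it also works
    when several entries were pasted onto a single line.
    """
    return Counter((m.group(2), m.group(4)) for m in ENTRY.finditer(log_text))


# Abbreviated sample based on the two entries pasted above.
sample = (
    "2025-02-22T05:22:46.121Z ERROR scheduling-3 "
    "c.h.m.i.d.r.RecordFileDownloader None of the data files could be "
    "verified, signatures: [...] "
    "2025-02-22T05:22:46.641Z ERROR parallel-7 "
    "c.h.m.i.d.r.RecordFileDownloader Error downloading signature files "
    "for node 0"
)

summary = summarize(sample)
for (level, logger), count in summary.most_common():
    print(level, logger, count)
```

To use it on a real run, capture the output of docker logs mirror-node-importer to a file and pass the file contents to summarize.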
I'm experiencing similar issues while running hedera start --dev --verbose=trace --enable-block-node: the node stops producing blocks after a few minutes, a few dozen blocks at most. I tried both on my M2 Pro Mac and on a Linux server with 32GB of RAM, 8 vCPUs, and a 240 GB SSD.
The output of docker logs mirror-node-importer constantly shows these kinds of messages:
2025-06-24T11:54:53.087Z INFO scheduling-5 o.h.m.i.d.r.RecordFileDownloader No new signature files to download after file: 2025-06-24T11_30_58.048069244Z.rcd.gz. Retrying in 0.5 s
2025-06-24T11:54:53.589Z INFO scheduling-1 o.h.m.i.d.r.RecordFileDownloader No new signature files to download after file: 2025-06-24T11_30_58.048069244Z.rcd.gz. Retrying in 0.5 s
2025-06-24T11:54:54.090Z INFO scheduling-1 o.h.m.i.d.r.RecordFileDownloader No new signature files to download after file: 2025-06-24T11_30_58.048069244Z.rcd.gz. Retrying in 0.5 s
2025-06-24T11:54:54.729Z WARN sdk-async-response-1-111 o.h.m.i.d.p.CompositeStreamFileProvider Attempt #1 failed: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.
2025-06-24T11:54:54.730Z INFO scheduling-6 o.h.m.i.d.r.RecordFileDownloader No new signature files to download after file: 2025-06-24T11_30_58.048069244Z.rcd.gz. Retrying in 0.5 s
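The repeated "No new signature files" lines above also show how long the importer has been stuck: the file name embeds the timestamp of the last record file it saw. A rough Python sketch of that arithmetic follows. It assumes the log timestamp and the file-name timestamp are both UTC and it drops fractional seconds; stall_seconds is a hypothetical helper, not something shipped with the mirror node.

```python
import re
from datetime import datetime

# Matches the "No new signature files" message from the pasted log and
# captures the log time and the timestamp embedded in the file name.
# Assumption: both timestamps are UTC; fractional seconds are dropped.
STALL = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+Z INFO \S+ \S+ "
    r"No new signature files to download after file: "
    r"(\d{4}-\d{2}-\d{2}T\d{2}_\d{2}_\d{2})\.\d+Z"
)


def stall_seconds(line):
    """Return whole seconds between the log time and the last file seen.

    Returns None if the line is not a "No new signature files" message.
    """
    m = STALL.search(line)
    if m is None:
        return None
    logged = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    last_file = datetime.strptime(m.group(2), "%Y-%m-%dT%H_%M_%S")
    return (logged - last_file).total_seconds()


line = (
    "2025-06-24T11:54:53.087Z INFO scheduling-5 "
    "o.h.m.i.d.r.RecordFileDownloader No new signature files to download "
    "after file: 2025-06-24T11_30_58.048069244Z.rcd.gz. Retrying in 0.5 s"
)
print(stall_seconds(line))  # 1435.0
```

For the first line pasted above this gives 1435 seconds, so no new record file had been picked up for roughly 24 minutes, which matches the node having stopped producing blocks after a few minutes of uptime.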
Hey @rista404, can you try without --enable-block-node? There are some problems when running the node for a long time with the block node enabled.
@georgi-l95 the proposed fix seems to be working and after a few hours the local node is still producing blocks. Thanks!