Bug: cannot finish DB compaction on macOS due to open file limit
Describe the bug Block compaction panics due to too many files open.
Electrs version 0.10.0, built from master.
To Reproduce Steps to reproduce the behavior:
- Build from master a1460ec4a90e45c863c29f66932dd91a0aea9f6f with
cargo build --locked --release - Run with
./target/release/electrs --log-filters=DEBUG --skip-block-download-wait - Wait until block compaction is reached
- Error:
thread 'main' panicked at 'DB::put failed: Error { message: "IO error: While open a file for random read: ./db/bitcoin/000213.sst: Too many open files" }', src/db.rs:342:14
Expected behavior I'd expect block compaction to finish
Configuration I don't use a configuration file for this.
Environment variables: ELECTRS_X=Y;...: none
Arguments: --foo: --log-filters=DEBUG --skip-block-download-wait
System running electrs
- Deployment method: manual
- OS name and version (name of distribution and version in case of Linux): macOS Ventura 13.5
Additional context I tried building this with Prometheus support in order to inspect the file descriptor progress, but got this result:
❯ cargo build --locked --release --all-features
Compiling electrs v0.10.0 (/Users/torkel/dev/rust/electrs)
error[E0432]: unresolved import `prometheus::process_collector`
--> src/metrics.rs:6:21
|
6 | use prometheus::process_collector::ProcessCollector;
| ^^^^^^^^^^^^^^^^^ could not find `process_collector` in `prometheus`
For more information about this error, try `rustc --explain E0432`.
error: could not compile `electrs` due to previous error
ulimit -n reports 256, but I'm not sure if this is the correct number for max open files.
Posting logs.
❯ ./target/release/electrs --log-filters=DEBUG --skip-block-download-wait
Starting electrs 0.10.0 on aarch64 macos with Config { network: Bitcoin, db_path: "./db/bitcoin", daemon_dir: "/Users/torkel/.bitcoin", daemon_auth: CookieFile("/Users/torkel/.bitcoin/.cookie"), daemon_rpc_addr: 127.0.0.1:8332, daemon_p2p_addr: 127.0.0.1:8333, electrum_rpc_addr: 127.0.0.1:50001, monitoring_addr: 127.0.0.1:4224, wait_duration: 10s, jsonrpc_timeout: 15s, index_batch_size: 10, index_lookup_limit: None, reindex_last_blocks: 0, auto_reindex: true, ignore_mempool: false, sync_once: false, skip_block_download_wait: true, disable_electrum_rpc: false, server_banner: "Welcome to electrs 0.10.0 (Electrum Rust Server)!", signet_magic: f9beb4d9, args: [] }
[2023-08-15T13:22:58.544Z DEBUG tiny_http] Server listening on 127.0.0.1:4224
[2023-08-15T13:22:58.544Z INFO electrs::metrics::metrics_impl] serving Prometheus metrics on 127.0.0.1:4224
[2023-08-15T13:22:58.544Z INFO electrs::server] serving Electrum RPC on 127.0.0.1:50001
[2023-08-15T13:22:58.544Z DEBUG tiny_http] Running accept thread
[2023-08-15T13:22:58.652Z INFO electrs::db] "./db/bitcoin": 652 SST files, 59.91444729 GB, 5.297402539 Grows
[2023-08-15T13:22:58.652Z DEBUG electrs::db] DB Some(Config { compacted: false, format: 0 })
[2023-08-15T13:22:59.489Z INFO electrs::chain] loading 803298 headers, tip=000000000000000000003b002ef7631e504a76fd0f46a4ed5905e6fa42b17701
[2023-08-15T13:23:00.601Z INFO electrs::chain] chain updated: tip=000000000000000000003b002ef7631e504a76fd0f46a4ed5905e6fa42b17701, height=803298
[2023-08-15T13:23:00.603Z DEBUG bitcoincore_rpc] JSON-RPC request: getblockchaininfo []
[2023-08-15T13:23:00.603Z DEBUG bitcoincore_rpc] JSON-RPC request: getnetworkinfo []
[2023-08-15T13:23:00.604Z DEBUG bitcoincore_rpc] JSON-RPC request: getnetworkinfo []
[2023-08-15T13:23:00.604Z DEBUG bitcoincore_rpc] JSON-RPC request: getblockchaininfo []
[2023-08-15T13:23:00.604Z DEBUG bitcoincore_rpc] JSON-RPC request: getnetworkinfo []
[2023-08-15T13:23:00.604Z DEBUG electrs::p2p] peer version: VersionMessage { version: 70016, services: ServiceFlags(1033), timestamp: 1692105780, receiver: Address {services: ServiceFlags(NONE), address: 0.0.0.0, port: 0}, sender: Address {services: ServiceFlags(NETWORK|WITNESS|NETWORK_LIMITED), address: 0.0.0.0, port: 0}, nonce: 788258135769235039, user_agent: "/Satoshi:24.0.1/", start_height: 803302, relay: true }
[2023-08-15T13:23:00.605Z DEBUG electrs::p2p] got 4 new headers
[2023-08-15T13:23:00.605Z INFO electrs::index] indexing 4 blocks: [803299..803302]
[2023-08-15T13:23:00.605Z DEBUG electrs::p2p] loading 4 blocks
[2023-08-15T13:23:00.751Z DEBUG electrs::index] writing 33610 funding and 27220 spending rows from 19149 transactions, 4 blocks
[2023-08-15T13:23:00.768Z INFO electrs::chain] chain updated: tip=00000000000000000000f04941e8c2596a2986658dcc93077901b2a7f83a57b4, height=803302
[2023-08-15T13:23:00.768Z DEBUG electrs::p2p] got 0 new headers
[2023-08-15T13:23:00.784Z INFO electrs::db] starting config compaction
[2023-08-15T13:23:00.785Z INFO electrs::db] starting headers compaction
[2023-08-15T13:23:01.368Z INFO electrs::db] starting txid compaction
car[2023-08-15T13:26:45.489Z DEBUG electrs::p2p] peer inventory: [Block(0x00000000000000000003e00fc88fd8fcd2fd0cdea5c96b66be51c259189e1af7)]
[2023-08-15T13:26:51.788Z INFO electrs::db] starting funding compaction
[2023-08-15T13:26:52.094Z INFO electrs::db] starting spending compaction
thread 'main' panicked at 'DB::put failed: Error { message: "IO error: While open a file for random read: ./db/bitcoin/000213.sst: Too many open files" }', src/db.rs:342:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2023-08-15T13:26:52.095Z INFO electrs::db] closing DB at ./db/bitcoin
[2023-08-15T13:26:53.125Z DEBUG electrs::p2p] closing p2p_loop thread: no more requests to handle
[2023-08-15T13:26:53.125Z DEBUG electrs::p2p] closing p2p_send thread: no more messages to send
[2023-08-15T13:26:53.125Z DEBUG electrs::p2p] closing p2p_recv thread: connection closed
Managed to get around this by increasing the file limit with ulimit -n LARGE_NUMBER. However I've never had software crash on my machine before due to this. I think it would be reasonable for electrs to try and stay within the allotted limit.
Thanks for reporting this issue! Will reproduce on a Linux machine to collect file descriptor usage metrics.
Additional context I tried building this with Prometheus support in order to inspect the file descriptor progress, but got this result:
❯ cargo build --locked --release --all-features
Compiling electrs v0.10.0 (/Users/torkel/dev/rust/electrs)
error[E0432]: unresolved import `prometheus::process_collector`
--> src/metrics.rs:6:21
|
6 | use prometheus::process_collector::ProcessCollector;
| ^^^^^^^^^^^^^^^^^ could not find `process_collector` in `prometheus`
Unfortunately, process_collector doesn't support MacOS :(
https://github.com/tikv/rust-prometheus/blob/6e81890773ef82e3bcc6c080d406543da1fb8073/src/process_collector.rs#L5
According to Prometheus, electrs uses ~350 file descriptors during compaction:
However I've never had software crash on my machine before due to this.
I assume that the reason behind this is that the initial sync process writes many SST files (containing the index) and then merges them during the full compaction. In order to merge them, RocksDB needs to open all of them - failing if there are more than 256 SST files to be merged.
This issue is expected since the blockchain is growing, and so is the number of files to merge during the full compaction.
We'll probably need to update the docs to suggest increasing ulimit -n to a larger value (on my machine it is set to 1024).
Right, interesting, thanks! Probably a good idea to update the docs, if it isn't possible to get RocksDB to be stingier with the files. For reference, my ulimit -n is 256, which I believe is the default on macOS