
Investigate and document hardware setups that optimize disk write and lifespan

Open · coderofstuff opened this issue 10 months ago · 3 comments

Higher BPS will require significantly higher write volumes to storage media. We should document ways to optimize node setups so as to maximize disk lifespan and performance when we go to higher BPS.

coderofstuff · Mar 25 '24

I've been running a comparison for 4 days 23.5 hours and am still seeing some nice disk write savings. These were 2 separate VMs on a single Windows 11 Client Hyper-V host. Both VMs run Windows 11 Pro with 56 GB RAM and 6 virtual CPUs from a 12th Gen Intel(R) Core(TM) i5-12600K. I was using PrimoCache on both with a 36 GB L1 RAM cache to minimize wear on the underlying SSD, which is a Samsung SSD 980 PRO 2TB.

Here's the information from the node running latest:

Pruning times:

2024-03-20 20:38:21.829-05:00 [INFO ] Starting Header and Block pruning... 2024-03-20 22:06:12.328-05:00 [INFO ] Header and Block pruning completed: traversed: 491080, pruned 451994

2024-03-21 08:29:34.966-05:00 [INFO ] Starting Header and Block pruning... 2024-03-21 09:56:16.567-05:00 [INFO ] Header and Block pruning completed: traversed: 486725, pruned 447496

2024-03-21 20:24:23.978-05:00 [INFO ] Starting Header and Block pruning... 2024-03-21 21:55:42.432-05:00 [INFO ] Header and Block pruning completed: traversed: 476662, pruned 437246

2024-03-22 08:32:00.561-05:00 [INFO ] Starting Header and Block pruning... 2024-03-22 10:03:31.269-05:00 [INFO ] Header and Block pruning completed: traversed: 474666, pruned 435721

2024-03-22 20:38:30.297-05:00 [INFO ] Starting Header and Block pruning... 2024-03-22 22:12:27.052-05:00 [INFO ] Header and Block pruning completed: traversed: 473133, pruned 434034

2024-03-23 08:33:38.980-05:00 [INFO ] Starting Header and Block pruning... 2024-03-23 10:08:09.169-05:00 [INFO ] Header and Block pruning completed: traversed: 478005, pruned 438769

2024-03-23 20:22:44.132-05:00 [INFO ] Starting Header and Block pruning... 2024-03-23 22:03:44.303-05:00 [INFO ] Header and Block pruning completed: traversed: 498908, pruned 459764

2024-03-24 08:14:37.256-05:00 [INFO ] Starting Header and Block pruning... 2024-03-24 09:49:12.724-05:00 [INFO ] Header and Block pruning completed: traversed: 474671, pruned 435798

2024-03-24 20:07:41.664-05:00 [INFO ] Starting Header and Block pruning... 2024-03-24 21:41:01.881-05:00 [INFO ] Header and Block pruning completed: traversed: 472693, pruned 433506

2024-03-25 07:58:44.050-05:00 [INFO ] Starting Header and Block pruning... 2024-03-25 09:36:10.700-05:00 [INFO ] Header and Block pruning completed: traversed: 473140, pruned 434227

Performance metrics:

2024-03-25 15:25:53.078-05:00 [TRACE] [perf-monitor] process metrics: RAM: 9406656512 (9.41GB), VIRT: 19268296704 (19.27GB), FD: 4423, cores: 6, total cpu usage: 3.4575
2024-03-25 15:26:03.083-05:00 [TRACE] [perf-monitor] disk io metrics: read: 17984155275801 (18TB), write: 7090310564916 (7TB), read rate: 11818591.963 (12MB/s), write rate: 2565557.277 (3MB/s)

And here's the data from the optimization branch: https://github.com/biryukovmaxim/rusty-kaspa/tree/rocksdb-optimizations

Pruning times:

2024-03-20 20:50:50.188-05:00 [INFO ] Starting Header and Block pruning... 2024-03-20 22:55:17.354-05:00 [INFO ] Header and Block pruning completed: traversed: 491080, pruned 451994

2024-03-21 08:31:58.413-05:00 [INFO ] Starting Header and Block pruning... 2024-03-21 10:41:09.741-05:00 [INFO ] Header and Block pruning completed: traversed: 486725, pruned 447496

2024-03-21 20:26:27.425-05:00 [INFO ] Starting Header and Block pruning... 2024-03-21 22:33:46.521-05:00 [INFO ] Header and Block pruning completed: traversed: 476662, pruned 437246

2024-03-22 08:33:24.643-05:00 [INFO ] Starting Header and Block pruning... 2024-03-22 11:16:26.402-05:00 [INFO ] Header and Block pruning completed: traversed: 474666, pruned 435721

2024-03-22 20:40:42.267-05:00 [INFO ] Starting Header and Block pruning... 2024-03-22 22:52:51.090-05:00 [INFO ] Header and Block pruning completed: traversed: 473133, pruned 434034

2024-03-23 08:36:17.572-05:00 [INFO ] Starting Header and Block pruning... 2024-03-23 10:47:14.354-05:00 [INFO ] Header and Block pruning completed: traversed: 478005, pruned 438769

2024-03-23 20:25:07.041-05:00 [INFO ] Starting Header and Block pruning... 2024-03-23 22:44:36.901-05:00 [INFO ] Header and Block pruning completed: traversed: 498908, pruned 459764

2024-03-24 08:15:37.933-05:00 [INFO ] Starting Header and Block pruning... 2024-03-24 10:28:23.155-05:00 [INFO ] Header and Block pruning completed: traversed: 474671, pruned 435798

2024-03-24 20:09:47.030-05:00 [INFO ] Starting Header and Block pruning... 2024-03-24 22:20:14.319-05:00 [INFO ] Header and Block pruning completed: traversed: 472693, pruned 433506

2024-03-25 07:59:28.877-05:00 [INFO ] Starting Header and Block pruning... 2024-03-25 10:15:44.295-05:00 [INFO ] Header and Block pruning completed: traversed: 473140, pruned 434227

Performance metrics:

2024-03-25 15:25:45.586-05:00 [TRACE] [perf-monitor] process metrics: RAM: 9713639424 (9.71GB), VIRT: 24021032960 (24.02GB), FD: 3282, cores: 6, total cpu usage: 3.5519
2024-03-25 15:25:45.586-05:00 [TRACE] [perf-monitor] disk io metrics: read: 18964146280014 (19TB), write: 3772184849200 (4TB), read rate: 12274983.905 (12MB/s), write rate: 1175046.086 (1MB/s)
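For quick reference, comparing the two runs over the same roughly 5-day window (computed from the disk io totals above): the optimization branch wrote about 3.77 TB versus about 7.09 TB on latest, i.e. 3.772 / 7.090 ≈ 0.53, roughly a 47% reduction in writes (write rate ~1.2 MB/s vs ~2.6 MB/s), while reads were slightly higher on the branch (~19 TB vs ~18 TB). The trade-off visible in the pruning logs is longer pruning passes: roughly 2 to 2.75 hours per pass on the branch versus roughly 1.5 hours on latest.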

callid0n · Mar 25 '24

I'm fully synced (running an archival node) on TN11 using the new 256 KB RocksDB block size. I'm going to let it run for a while to see if it falls out of sync or anything strange happens.

I currently have five 7200 RPM HDDs configured in a Windows Parity Storage Space (essentially RAID 5) using the PowerShell below.

New-VirtualDisk -StoragePoolFriendlyName PoolName -FriendlyName vDiskNameHere -ProvisioningType Fixed -ResiliencySettingName Parity -UseMaximumSize -NumberOfColumns 5 -Interleave 256KB |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -DriveLetter k -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "KasData" -AllocationUnitSize 1024KB -UseLargeFRS -Confirm:$false

So that's a 256 KB interleave and a 1024 KB NTFS allocation unit size; with 5 columns and single parity, a full stripe carries 4 data columns × 256 KB = 1024 KB, so the allocation unit matches the data stripe width. I'm also using PrimoCache to create a 5 GB RAM write cache and a 50 GB SSD read cache.

And here is the code ChatGPT gave me to alter the file \database\src\db\conn_builder.rs to set the 256 KB RocksDB block size:


macro_rules! default_opts {
    ($self: expr) => {{
        let mut opts = rocksdb::Options::default();
        if $self.parallelism > 1 {
            opts.increase_parallelism($self.parallelism as i32);
        }

        opts.optimize_level_style_compaction($self.mem_budget);
        let guard = kaspa_utils::fd_budget::acquire_guard($self.files_limit)?;

        // Create BlockBasedOptions and set block size
        let mut block_opts = rocksdb::BlockBasedOptions::default();
        block_opts.set_block_size(256 * 1024);
        opts.set_block_based_table_factory(&block_opts);

        opts.set_max_open_files($self.files_limit);
        opts.create_if_missing($self.create_if_missing);
        Ok((opts, guard))
    }};
}
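
For anyone who wants to sanity-check the setting outside the node, here is a minimal standalone sketch, assuming the rust-rocksdb crate ("rocksdb" on crates.io) as a dependency; the database path and key/value are placeholders and not part of the node code:

fn main() -> Result<(), rocksdb::Error> {
    let mut opts = rocksdb::Options::default();
    opts.create_if_missing(true);

    // Same change as in conn_builder.rs above: a block-based table factory
    // configured with a 256 KB block size.
    let mut block_opts = rocksdb::BlockBasedOptions::default();
    block_opts.set_block_size(256 * 1024);
    opts.set_block_based_table_factory(&block_opts);

    // Open a throwaway database and write/read one value to confirm it works.
    let db = rocksdb::DB::open(&opts, "/tmp/blocksize-test")?;
    db.put(b"key", b"value")?;
    assert_eq!(db.get(b"key")?.as_deref(), Some(&b"value"[..]));
    Ok(())
}

As I understand it, the point of the 256 KB block size is that it lines up with the 256 KB interleave of the parity space, so a single SST block read maps onto one stripe unit rather than straddling two disks.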

https://discord.com/channels/599153230659846165/755890250643144788/1223301320769798297

callid0n · Mar 29 '24

This config fell out of sync once the amount of data served from the spinning disks grew too large. Still hunting for a spinning-disk archival config that works. https://github.com/kaspanet/rusty-kaspa/issues/441#issuecomment-2027512540

callid0n · Apr 10 '24