chia-plotter
reduced plotting performance after upgrading from 0.0.5 to 0.1.1
After upgrading from chia_plot version 0.0.5 to 0.1.1 (on Windows), I saw a ~50% decrease in speed on the same system with the same configuration.
command used with 0.0.5:
chia_plot --threads 15 --tmpdir D:\ --tmpdir2 D:\ --farmerkey ... --poolkey ...
command used with 0.1.1 is the same as above except with --contract instead of --poolkey
the D:\ drive here is a single SSD. The default value of 256 buckets was not changed.
I also tested 0.1.1 again with more threads (--threads 30) and only saw minimal improvements. Some estimated average times:
| phase | v0.0.5 (15 threads) | v0.1.1 (15 threads) | v0.1.1 (30 threads) |
|---|---|---|---|
| 1 | 1100s | 1700s | 1400s |
| 2 | 500s | 800s | 700s |
| 3 | 480s | 1800s | 1800s |
| 4 | 170s | 170s | 170s |
| total | 2250s | 4470s | 4070s |
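To make the table easier to read, here is a minimal sketch that computes the per-phase slowdown relative to v0.0.5. The numbers are copied from the table above; the dictionary layout and the `slowdown` helper are only illustrative names, not part of chia_plot.

```python
# Estimated average phase times in seconds (phases 1-4), from the table above.
times = {
    "v0.0.5 (15 threads)": [1100, 500, 480, 170],
    "v0.1.1 (15 threads)": [1700, 800, 1800, 170],
    "v0.1.1 (30 threads)": [1400, 700, 1800, 170],
}
baseline = times["v0.0.5 (15 threads)"]


def slowdown(run):
    """Per-phase ratio of a run's time to the v0.0.5 baseline (1.0 = unchanged)."""
    return [round(t / b, 2) for t, b in zip(run, baseline)]


for name, run in times.items():
    print(f"{name}: total {sum(run)}s, per-phase ratio {slowdown(run)}")
```

By this measure phase 3 is the clear outlier at 3.75x the old time, while phase 4 is unchanged.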
system:
- Ryzen 3950X
- 64GB RAM (4x16GB) 3200MHz
- Intel P4600 1.8TB SSD (used for both temp1 and temp2; est. 1600MB/s write, 3000MB/s read speeds)
Judging by Task Manager, there are a lot of unused system resources while plotting with 0.1.1, whereas with 0.0.5 chia_plot was frequently using nearly 100% of the available CPU time and SSD bandwidth.
I understand that changes have been made since version 0.0.5 that may have improved performance on some Xeon and Threadripper test systems, but it seems to have greatly hurt performance in this case.
Maybe we could get those new upgrades toggled on/off from the command line? As it stands right now, version 0.1.1 is required in order to use the new Chia pooling protocol, which also means all the old plots made with 0.0.5 will need to be re-created under 0.1.1 at this reduced speed.
I'm also experiencing similar performance issues. My 3700x with 2x NVME would finish in 3300s-3500s but now it's ~4000s+
worth noting that at ~75min/plot, I am getting about 19 plots/day, which is actually less than I was able to get with the official Chia plotter (maxed around 26 plots/day), which is not a good situation for mad max plotter to be in as an alternative plotter :(
Previously with version 0.0.5 I was getting about 40 plots/day on this system
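The plots/day figures follow directly from the time per plot. A rough sketch, assuming plots run strictly back to back with no idle time in between (the helper name is just for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def plots_per_day(seconds_per_plot):
    """Plots per day for one plotter running sequentially without gaps."""
    return SECONDS_PER_DAY / seconds_per_plot


print(round(plots_per_day(4470), 1))  # 0.1.1 total from the table: 19.3
print(round(plots_per_day(2250), 1))  # 0.0.5 total from the table: 38.4
```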
I'm experiencing the same issue using a Ryzen 5900x.
Plot hardware: Ryzen 5900x, 128GB DDR4-3600 RAM, 2 x 2TB FireCuda 520 NVMe drives in RAID 0
Plotter config: 22 threads, 256 & 128 buckets, t1: NVMe RAID 0 drive, t2: ramdisk
Plot times are varying a lot. Anywhere between 40 minutes and 85 minutes. I've tried loads of different settings, it seems like the system is underperforming? I've been trawling the internet and other people with similar configs seem to be having the same issue.
Something weird seems to be going on with the temp directories. Even though -G is not set, it seems to be alternating the drives?
make sure to trim your SSDs regularly with 'sudo fstrim -v /mnt/ssdpath/' and also mount them (and any SMR HDDs) with the discard option.
I have trim enabled and have also turned off indexing and the write cache buffer, with the CPU priority set to realtime.
I just can't get the system to consistently write 30-40 minute plots.
is this all on windows?
Hey - yes, I am on windows10.
A commenter on issue #786 cut their plotting times in half by installing ubuntu - so looks like a windows issue?
try this version https://github.com/stotiks/chia-plotter/releases/download/v0.1.1/chia_plot_0.1.2a.zip
make sure to trim your SSDs regularly with 'sudo fstrim -v /mnt/ssdpath/' and also mount them (and any SMR HDDs) with the discard option.
When you mount with discard you don't need to explicitly trim.
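On Linux, one way to see whether a temp drive is already mounted with `discard` is to look at its options in `/proc/mounts`. A small sketch of that check, run here against a sample line; the device name `/dev/nvme0n1p1` and the `ext4` filesystem are made up for illustration, and `/mnt/ssdpath` is the placeholder path from the comment above:

```python
def mount_options(mounts_text, mountpoint):
    """Return the option list for a mountpoint from /proc/mounts-style text.

    Each line has the form: device mountpoint fstype options dump pass.
    Returns None if the mountpoint is not listed.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mountpoint:
            return fields[3].split(",")
    return None


# Hypothetical sample entry; on a real system read the text from /proc/mounts.
sample = "/dev/nvme0n1p1 /mnt/ssdpath ext4 rw,noatime,discard 0 0\n"
options = mount_options(sample, "/mnt/ssdpath")
print("discard" in options)
```

On a real machine the text would come from `open("/proc/mounts").read()`.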
try this version https://github.com/stotiks/chia-plotter/releases/download/v0.1.1/chia_plot_0.1.2a.zip
Thanks - testing now, will let you know how it goes after first plot
try this version https://github.com/stotiks/chia-plotter/releases/download/v0.1.1/chia_plot_0.1.2a.zip
Giving this a shot as well.
try this version https://github.com/stotiks/chia-plotter/releases/download/v0.1.1/chia_plot_0.1.2a.zip
Will try as well, because 3 of my 4 PCs dropped in speed on Windows 10 by approximately 20-60%.
Number of threads: 22 Number of Buckets P1: 256 Number of Buckets P3+P4: 256 n 5
CPU: Ryzen 5900x, RAM: 128GB, T1: 2 x 2TB FireCuda NVMe drives in RAID 0, T2: 115GB RAM disk
Plot 1 Phase 1: 1018s Phase 2: 540s Phase 3: 1075s Phase 4: 87s Total Time: 2719s (45 minutes)
Only 1 plot written so far, but doesn't seem much different.
@Mattchew86, maybe the NVMe is overheating, or it's something else. Here are my results with v0.1.1:
AMD Ryzen 7 5800X, 64GB @ 3600MHz
T1: Gigabyte AORUS M.2 Gen4 PCIe X4 NVMe 2TB
T2: Gigabyte AORUS M.2 Gen4 PCIe X4 NVMe 2TB
Crafting plot 67 out of 145
Process ID: 3612
Number of Threads: 16
Number of Buckets P1: 2^9 (512)
Number of Buckets P3+P4: 2^8 (256)
Phase 1 took 994.423 sec
Phase 2 took 425.397 sec
Phase 3 took 504.534 sec, wrote 21872348936 entries to final plot
Phase 4 took 56.2946 sec, final plot size is 108806383894 bytes
Total plot creation time was 1980.75 sec (33.0126 min)
@stotiks What OS are you on? I'm on Windows10 Pro 64-bit.
I don't think it's linked to the NVME's, as:
- The temperature for both drives is showing as 40°C in crystal disk and remains constant throughout the whole plotting process, and they have their own fan and thermal paste.
- I have another 1TB WD drive that I use for the OS; I tried that for plotting and saw no difference.
- There seems to be little difference between plotting using only the NVMe drives vs. using T1 & T2 with T2 as a RAM disk.
I have had a plot on v0.1.1 that has plotted in around 35 minutes, so something isn't right.
There seems to be an issue at phase 3.
Plot 2 Phase 1: 1079s Phase 2: 818s Phase 3: 1480s Phase 4: 91s Total Time: 3468s (58 minutes)
I am on Ubuntu and I'm experiencing the same issue.
Previous version: 20 mins Current version: 60 mins
@stotiks Just for completeness, here are the results of the first 4 of my plots using v0.1.2a:
Plot 1 Phase 1: 1018s Phase 2: 540s Phase 3: 1075s Phase 4: 87s Total Time: 2719s (45 minutes)
Plot 2 Phase 1: 1079s Phase 2: 818s Phase 3: 1480s Phase 4: 91s Total Time: 3468s (58 minutes)
Plot 3 Phase 1: 1195s Phase 2: 880s Phase 3: 1419s Phase 4: 78s Total Time: 3571s (60 minutes)
Plot 4 Phase 1: 1057s Phase 2: 709s Phase 3: 1307s Phase 4: 79s Total Time: 3153s (53 minutes)
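To see where the plot-to-plot variation comes from, the four summary lines above can be parsed and compared per phase. This is only a sketch: the regular expression matches the hand-written summary format used in this comment, not chia_plot's own log format.

```python
import re
from statistics import mean

# The four summary lines posted above, copied verbatim.
log = """\
Plot 1 Phase 1: 1018s Phase 2: 540s Phase 3: 1075s Phase 4: 87s Total Time: 2719s (45 minutes)
Plot 2 Phase 1: 1079s Phase 2: 818s Phase 3: 1480s Phase 4: 91s Total Time: 3468s (58 minutes)
Plot 3 Phase 1: 1195s Phase 2: 880s Phase 3: 1419s Phase 4: 78s Total Time: 3571s (60 minutes)
Plot 4 Phase 1: 1057s Phase 2: 709s Phase 3: 1307s Phase 4: 79s Total Time: 3153s (53 minutes)
"""

PHASE = re.compile(r"Phase (\d): (\d+)s")


def phase_times(line):
    """Map phase number to seconds for one summary line."""
    return {int(phase): int(secs) for phase, secs in PHASE.findall(line)}


plots = [phase_times(line) for line in log.splitlines()]
for phase in (1, 2, 3, 4):
    values = [p[phase] for p in plots]
    print(f"Phase {phase}: min {min(values)}s, max {max(values)}s, mean {mean(values):.0f}s")
```

Phases 2 and 3 show the widest spread between the fastest and slowest plot, while phases 1 and 4 stay comparatively stable.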
@vvavepacket @ditaker @aj10017 @stevekm
Is your temp plotting drive in RAID?
Mine is not. I have 4 PCs, all plotting on 1TB M.2 SSDs with a speed of about 2500. All have 4x8GB RAM. Three PCs have 20 threads (Intel 10900X) and one PC has 8 threads (Intel 9700K, if I'm not mistaken).
So... 3 to 5 days ago they created plots in approximately 5000 to 7000 sec. Now the PC with the 9700K (the weakest) takes about 6000 sec (more or less unchanged), while the PCs with the 10900X take about 10000 sec (previously 5000-6000 sec on average).
Maybe I did something wrong, but I updated to Chia 1.2, then downloaded the new plot.exe file and put it in the directory (replacing the old 1.1 MB version with the new 1.8 MB version). I didn't change anything else.
I have no idea why this happens, or why the PCs with more threads and better RAM now run slower than the weakest of the 4 PCs :D.
I will have 4 free hours tomorrow and will try again if nobody with the same problem has found a solution before then.
Here is the latest timing from one PC: Phase 1: 6000 sec, Phase 2: 2166 sec, Phase 3: 4992 sec, Phase 4: 193 sec, Total: 13623 sec (previously this PC took about 5500 sec). Settings: r -18, u -7. With settings r -18, I -8 the total time was about 10000 sec. PC: 10900X (10 cores, 20 threads), 32GB RAM, 1TB M.2 SSD with about 2500 write/read speed.
Well I swapped to my OS NVME 1tb drive and two plots in a row sub 40 minutes - so that would suggest my NVME raid drive is causing an issue for me - nothing else changed
@stotiks
try this version https://github.com/stotiks/chia-plotter/releases/download/v0.1.1/chia_plot_0.1.2a.zip
Unfortunately this version actually runs slower for me, avg 4700s with 15 threads. Especially phase 3 took avg 1950s.
what I see in my logs is that plot creation time remains the same (50 min on my machine), but total time for a single plot (creation + copy) increased (2 hours on my machine). Looks like the process of copying is not dedicated now and next plot creation is suspended until copying of the previous is finished.
Total plot creation time was 3386.29 sec (56.4382 min)
Started copy to F:\plots\plot-k32-2021-07-09-23-19-0abcfa6e2b8105fb92eac0299f274251b3a4aa54169a1f3261db75a196c11866.plot
Copy to F:\plots\plot-k32-2021-07-09-21-38-d80e5d3a9e28fcfdedafa6cbc38132109a9bf48e77d026bc19f3341aff80e5f4.plot finished, took 6174.27 sec, 16.8107 MB/s avg.
How do we start the file copy process asynchronously, so that it copies the file in the background while the next plot starts immediately?
It is already doing that as you describe.
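The numbers in the log above suggest why the copy still ends up limiting throughput even when it runs in the background. A back-of-the-envelope sketch, under two assumptions that the thread does not confirm: the reported "MB/s" is really MiB/s, and only one copy to the destination drive runs at a time.

```python
plot_seconds = 3386.29     # "Total plot creation time" from the log above
copy_seconds = 6174.27     # copy duration from the log above
copy_rate_mib_s = 16.8107  # reported average rate (assumed to be MiB/s)

# Size of the copied plot implied by rate x time.
copied_gib = copy_rate_mib_s * copy_seconds / 1024

# If plotting and copying overlap but copies are serialized, the slower of
# the two stages sets the steady-state time per finished plot.
cycle_seconds = max(plot_seconds, copy_seconds)

print(f"copied about {copied_gib:.1f} GiB")
print(f"steady-state time per plot: about {cycle_seconds / 60:.0f} min")
```

Under these assumptions the destination drive, not the plotter, is the bottleneck here: a copy at roughly 17 MB/s takes longer than creating the next plot.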
Hopefully this helps someone. With the previous version I was plotting in around 9000 seconds (10-year-old i5 2500K), but since switching to 0.1.1 with the -c option, times rose to around 13000-14000 seconds. I tried a few things, including 0.1.2a, which was the same speed or possibly a bit slower.
However after MS forcing a windows update last night my speeds are back to close to normal, last 3 plots using 0.1.1 have been 9600 seconds, 9800 and now the last one was 9700 so a little slower than before (10%ish) but close enough for me. Maybe something to do with Windows redistributable packages?
I am experiencing the same thing on a 5800X across multiple brands of NVMe using the Windows 10 version. Phase 1 and Phase 3 times both started spiking once I started NFT plotting with stotiks' 0.1.1 version. I am not experiencing any heat throttling either. The times seem to have become longer specifically after I updated.
Specifically, the P3-2 steps in Phase 3 now take much longer, and during them my CPU barely crosses 30% and normally sits in the 10-20% range.
Secondly, the Phase 1 calculation time got a bit longer, but I do not notice the same CPU % correlation there as in the Phase 3-2 steps.
EDIT: I am noticing the CPU drops in phase 1 as well. But they are more drastic in Phase 3-2 instances.
5800X (stock), Tomahawk B550, 32GB RAM @ 3200, FireCuda 1TB Gen 4 (used for all temp writing)
are we sure this is specific to Windows? Seems like Linux users are also reporting performance drops?
Re: Windows updates; I am running Windows 10 Pro 21H1 with all updates applied and still getting the reduced plot rates
idk what people expect from el cheapo nvme drives - check your nvme saturation in task manager and I promise, it is 100% all the time when the CPU is in 10-20% range. I plot with an enterprise HPE NVMe with 29PB TBW (TLC) ($2000) and it can barely keep up with a 5900X.
I understand why you might think this if you've been plotting on 2k enterprise grade equipment. But the fact is it has nothing to do with the hardware as nothing has changed. Do you think everyone here just decided to all change out their NVMe drives right when the contract plotter came out thus increasing all their times? The point is even though there were no configuration changes from before and now, timings still increased seemingly for no reason.
And no, my NVMe drives stop being saturated in the same places where I mentioned the CPU activity drops: Phase 1 and the Phase 3-2 iterations. And this happens regardless of whether I slap in the FireCudas I primarily use or the cheapo 60 dollar WD Blues I have. All their timings have increased by 40-60% due to Phase 1 and 3-2.
Here's what my NVMe activity looks like specifically in Phase 3-2 iterations since my plotter happened to be in it when I was posting this. It crashes to sometimes single digits. Then ramps back up to 100% in 3-1 iterations. This happens on every NVMe I try.
And here is what the activity time looks like once it hits Phase 3-1 iterations
I have downloaded and tested 0.0.5 as well and the issue is still happening. Now I am suspicious this is related to some kind of Windows update that happened at roughly the same time as the contract plotter that is causing some bottlenecks in the plotter.
To note, my plotter is on 21H1 with the most recent KB updates.