
128GB mode Failed to Write Slice Error win11 only occurs on SSDs

Open BrandtH22 opened this issue 1 year ago • 21 comments

This issue was reported by Delerium in discord: https://discord.com/channels/1034523881404370984/1102690350218354920/1179030557271793715

Hi guys - I'm using Bladebit 3.1 on Win 11 with 128GB RAM and an 8GB Nvidia RTX 2080, and regardless of settings I keep getting "Failed to write slice on F://p1unsortedx-p1lpairs-p3lp-p3-lmap.tmp error 0". The temp drive is a local 1TB SSD. What am I doing wrong?

Command used: bladebit_cuda -f xxxx -c xxxx -n 1 --compress 2 cudaplot --disk-128 -t1 F:/ F:/

Notes:

  • Issue occurs only when setting an SSD as a temp drive (HDDs work without issue)
  • Tried reformatting the SSD (even tried multiple format types)
  • Tried using the standalone bladebit and also the integrated bladebit (the integrated version reports STDERR: Failed to write slice)
  • Tried multiple different SSDs (3 Samsung Pro EVO 1 TB)

Full CLI of failed run:

Bladebit Chia Plotter
Version      : 3.1.0
Git Commit   : e9836f8bd963321457bc86eb5d61344bfb76dcf0
Compiled With: msvc 19.29.30152

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : f
 Pool contract address : f
 Compression Level     : 2
 Benchmark mode        : disabled

[Bladebit CUDA Plotter]
 Host RAM            : 127 GiB
 Plot checks         : disabled

Selected cuda device 0 : NVIDIA GeForce RTX 2080
 CUDA Compute Capability   : 7.5
 SM count                  : 46
 Max blocks per SM         : 16
 Max threads per SM        : 1024
 Async Engine Count        : 2
 L2 cache size             : 4.00 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 8.00 GB
  Free                     : 6.96 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 92412135120  bytes ( 88131.08  MiB or 86.07  GiB )
Intermediate RAM required : 4385218560   bytes ( 4182.07   MiB or 4.08   GiB )
Host RAM required         : 28420603904  bytes ( 27104.00  MiB or 26.47  GiB )
Total Host RAM required   : 120832739024 bytes ( 115235.08 MiB or 112.53 GiB )
GPU RAM required          : 6167756800   bytes ( 5882.03   MiB or 5.74   GiB )
Allocating buffers...
Done.

Generating plot 1 / 1: fbbb9cf468011ec5123479b0742f2dea31874c57a4f72d074a17b6b4ddc1be5d
Plot temporary file: F:/plotdone/plot-k32-c02-2023-11-28-20-36-fbbb9cf468011ec5123479b0742f2dea31874c57a4f72d074a17b6b4ddc1be5d.plot.tmp

Generating F1
Finished F1 in 12.17 seconds.
Table 2 completed in 37.99 seconds with 4294967296 entries.

Fatal Error:
Failed to write slice on 'F://p1unsortedx-p1lpairs-p3lp-p3-lmap.tmp' with error 0.

Full CLI of completed run using an HDD (attached): chialog.txt
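
The memory figures in the failed run's log can be sanity-checked with a little arithmetic. The sketch below is in Python, with the values copied from the "Allocating buffers" section of that log; the variable and function names are only illustrative. It shows that the kernel and host figures add up to the reported total, and that the total sits below the 127 GiB of host RAM, so on paper the failure does not look like a plain shortage of memory.

```python
# Values copied from the "Allocating buffers" section of the failed run's log.
kernel_ram = 92_412_135_120      # "Kernel RAM required"
host_ram = 28_420_603_904        # "Host RAM required"
total_host_ram = 120_832_739_024 # "Total Host RAM required"
installed_gib = 127              # "Host RAM : 127 GiB"

MIB = 2**20
GIB = 2**30


def describe(n_bytes: int) -> str:
    """Format a byte count the same way the log does (bytes, MiB, GiB)."""
    return f"{n_bytes} bytes ( {n_bytes / MIB:.2f} MiB or {n_bytes / GIB:.2f} GiB )"


print("Kernel :", describe(kernel_ram))
print("Host   :", describe(host_ram))
print("Total  :", describe(total_host_ram))

# In this log, kernel + host equals the reported total exactly.
print("kernel + host == total:", kernel_ram + host_ram == total_host_ram)

# Headroom left on a machine reporting 127 GiB of host RAM.
print(f"Headroom: {installed_gib - total_host_ram / GIB:.2f} GiB")
```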

BrandtH22 avatar Nov 29 '23 18:11 BrandtH22

I'm active on this, @harold-b, so if you need me to test alternative settings (or experimental releases) to root-cause this, please just reach out. (I'm the original raiser of the issue, Delerium on Discord.)

GetStreamlined avatar Nov 30 '23 13:11 GetStreamlined

Thank you, @GetStreamlined. Do you get the same issue w/ the SSD if you use --no-direct-io?

It's a global option that should come somewhere before cudaplot

harold-b avatar Nov 30 '23 18:11 harold-b

@harold-b Sadly same issue. Command used:

bladebit_cuda -f redacted -c redacted -n 1 --compress 5 --no-direct-io cudaplot --disk-128 -t1 F:/ F:/

Output:

Bladebit Chia Plotter
Version      : 3.1.0
Git Commit   : e9836f8bd963321457bc86eb5d61344bfb76dcf0
Compiled With: msvc 19.29.30152

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : redacted
 Pool contract address : redacted
 Compression Level     : 5
 Benchmark mode        : disabled

[Bladebit CUDA Plotter]
 Host RAM            : 127 GiB
 Plot checks         : disabled

Selected cuda device 0 : NVIDIA GeForce RTX 2080
 CUDA Compute Capability   : 7.5
 SM count                  : 46
 Max blocks per SM         : 16
 Max threads per SM        : 1024
 Async Engine Count        : 2
 L2 cache size             : 4.00 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 8.00 GB
  Free                     : 6.96 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 92412135120  bytes ( 88131.08  MiB or 86.07  GiB )
Intermediate RAM required : 4385218560   bytes ( 4182.07   MiB or 4.08   GiB )
Host RAM required         : 28420603904  bytes ( 27104.00  MiB or 26.47  GiB )
Total Host RAM required   : 120832739024 bytes ( 115235.08 MiB or 112.53 GiB )
GPU RAM required          : 6167756800   bytes ( 5882.03   MiB or 5.74   GiB )
Allocating buffers...
Done.

Generating plot 1 / 1: bf98e067348b10a1c3e431deea13573e25606eb1a3a5404ac45cfcf004c1b101
Plot temporary file: F:/plot-k32-c05-2023-11-30-23-10-bf98e067348b10a1c3e431deea13573e25606eb1a3a5404ac45cfcf004c1b101.plot.tmp

Generating F1
Finished F1 in 12.39 seconds.
Table 2 completed in 36.91 seconds with 4294967296 entries.

Fatal Error:
Failed to write slice on 'F://p1unsortedx-p1lpairs-p3lp-p3-lmap.tmp' with error 0.

GetStreamlined avatar Nov 30 '23 23:11 GetStreamlined

Also conducted an iotest:

C:\Chia\Chia_Plotting\Plotting>bladebit_cuda iotest F:/
Size   : 4096.00 MiB
Cache  : 0.00 MiB
Threads: 1
Passes : 1
Performing test with file F:/
Allocating buffer...

Writing...
Wrote 4096.00 MiB in 2.03 seconds @ 2016.74 MiB/s (1.97 GiB/s) or 2115 MB/s (2.11 GB/s).

Reading...
Read 4096.00 MiB in 1.49 seconds @ 2758.25 MiB/s (2.69 GiB/s) or 2892 MB/s (2.89 GB/s)

GetStreamlined avatar Nov 30 '23 23:11 GetStreamlined

I also switched the video card to an NVIDIA GeForce GTX 1660 Ti to do more troubleshooting. Sadly, same output.

Bladebit Chia Plotter
Version      : 3.1.0
Git Commit   : e9836f8bd963321457bc86eb5d61344bfb76dcf0
Compiled With: msvc 19.29.30152

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : xxxx
 Pool contract address : xxxx
 Compression Level     : 5
 Benchmark mode        : disabled

[Bladebit CUDA Plotter]
 Host RAM            : 127 GiB
 Plot checks         : disabled

Selected cuda device 0 : NVIDIA GeForce GTX 1660 Ti
 CUDA Compute Capability   : 7.5
 SM count                  : 24
 Max blocks per SM         : 16
 Max threads per SM        : 1024
 Async Engine Count        : 2
 L2 cache size             : 1.50 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 6.00 GB
  Free                     : 5.02 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 92412135120  bytes ( 88131.08  MiB or 86.07  GiB )
Intermediate RAM required : 4385218560   bytes ( 4182.07   MiB or 4.08   GiB )
Host RAM required         : 28420603904  bytes ( 27104.00  MiB or 26.47  GiB )
Total Host RAM required   : 120832739024 bytes ( 115235.08 MiB or 112.53 GiB )
GPU RAM required          : 6167756800   bytes ( 5882.03   MiB or 5.74   GiB )
Allocating buffers...
Done.

Generating plot 1 / 1: e49016c42914b4a4f527bdd2abaf6817e7f344acce768e4cd0a09e257c4c3ae0
Plot temporary file: F:/plot-k32-c05-2023-12-01-16-59-e49016c42914b4a4f527bdd2abaf6817e7f344acce768e4cd0a09e257c4c3ae0.plot.tmp

Generating F1
Finished F1 in 13.62 seconds.
Table 2 completed in 79.00 seconds with 4294938662 entries.

Fatal Error:
Failed to write slice on 'F://p1unsortedx-p1lpairs-p3lp-p3-lmap.tmp' with error 0.

GetStreamlined avatar Dec 01 '23 17:12 GetStreamlined

You run Terminal as Admin?

teamwest93 avatar Dec 01 '23 17:12 teamwest93

> You run Terminal as Admin?

I did, yes, and also tried without.

GetStreamlined avatar Dec 01 '23 17:12 GetStreamlined

Additionally tried in powershell (with and without administrator). Same issue.

GetStreamlined avatar Dec 01 '23 20:12 GetStreamlined

What about beta1 or rc1 versions?

teamwest93 avatar Dec 02 '23 06:12 teamwest93

> What about beta1 or rc1 versions?

Sadly, they give a slightly different error (Failed to open plot file with error: 3).

GetStreamlined avatar Dec 02 '23 08:12 GetStreamlined

@harold-b is there any update on this issue? I'm keen to get plotting as I don't want to go to Gigahorse.

GetStreamlined avatar Dec 15 '23 09:12 GetStreamlined

I wonder if this is related to block size. Would you mind running diskplot on those target SSDs to see what block size bladebit reports (you don't have to make a plot, it should just report the block size for the temp directories).

harold-b avatar Dec 16 '23 02:12 harold-b

@harold-b

Here is the result of the diskplot using the SSD:

[Bladebit Disk Plotter]
 Heap size      : 3.37 GiB ( 3452.88 MiB )
 Cache size     : 0.00 GiB ( 0.00 MiB )
 Bucket count   : 256
 Alternating I/O: false
 F1  threads    : 16
 FP  threads    : 16
 C   threads    : 16
 P2  threads    : 16
 P3  threads    : 16
 I/O threads    : 1
 Temp1 block sz : 16384
 Temp2 block sz : 16384
 Temp1 path     : F:/
 Temp2 path     : F:/
 I/O metrices enabled.
 Allocating memory

If I use the HDD instead, it's different:

Temp1 block sz : 4096
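
For anyone who wants to compare drives without starting the plotter, here is a minimal Python sketch that asks the OS for a volume's allocation unit size. It assumes that the "block sz" bladebit prints corresponds to that allocation unit (cluster) size, which this thread does not confirm. The function name `allocation_unit_size` is made up for illustration; on Windows it calls the Win32 `GetDiskFreeSpaceW` function through `ctypes`, and elsewhere it falls back to `os.statvfs`.

```python
import ctypes
import os
import sys


def allocation_unit_size(path: str) -> int:
    """Return the allocation unit (cluster/block) size in bytes of the volume holding `path`."""
    if sys.platform == "win32":
        # GetDiskFreeSpaceW expects the root of the volume, e.g. "F:\\".
        drive, _ = os.path.splitdrive(os.path.abspath(path))
        root = drive + "\\"
        sectors_per_cluster = ctypes.c_ulong(0)
        bytes_per_sector = ctypes.c_ulong(0)
        free_clusters = ctypes.c_ulong(0)
        total_clusters = ctypes.c_ulong(0)
        ok = ctypes.windll.kernel32.GetDiskFreeSpaceW(
            ctypes.c_wchar_p(root),
            ctypes.byref(sectors_per_cluster),
            ctypes.byref(bytes_per_sector),
            ctypes.byref(free_clusters),
            ctypes.byref(total_clusters),
        )
        if not ok:
            raise ctypes.WinError()
        return sectors_per_cluster.value * bytes_per_sector.value
    # POSIX fallback: the filesystem block size reported by statvfs.
    return os.statvfs(path).f_bsize


# Report the value for the volume that holds the current directory.
print(allocation_unit_size("."))
```

Calling `allocation_unit_size("F:\\")` on the machine from this thread would check the temp drive directly. If the assumption above holds, the SSD and the HDD should show different values.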

GetStreamlined avatar Dec 17 '23 17:12 GetStreamlined

Thanks for the info! So it does look like it is block-size related. As a workaround for the time being, you can try resetting the SSDs w/ 4k block size while this is resolved.
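
As a general illustration of why a larger block size can matter for block-aligned I/O: a length that is a multiple of 4096 is not necessarily a multiple of 16384. The Python sketch below is not bladebit's code, and the thread does not describe the actual bug; the function names and the 20480-byte example length are hypothetical.

```python
def is_aligned(length: int, block_size: int) -> bool:
    """True if `length` is a whole number of blocks."""
    return length % block_size == 0


def round_up(length: int, block_size: int) -> int:
    """Round `length` up to the next multiple of `block_size`."""
    return (length + block_size - 1) // block_size * block_size


# Hypothetical write length: 5 blocks of 4096 bytes.
length = 5 * 4096

for block_size in (4096, 16384):
    print(
        f"block size {block_size:>5}: "
        f"aligned={is_aligned(length, block_size)}, "
        f"rounded up={round_up(length, block_size)}"
    )
```

A length that passes the check at 4096 fails it at 16384 and has to be padded to 32768, so any code that hard-codes or assumes a 4096-byte block would produce misaligned sizes on a volume reporting 16384.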

harold-b avatar Dec 17 '23 19:12 harold-b

> Thanks for the info! So it does look like it is block-size related. As a workaround for the time being you can try resetting the SSDs w/ 4k block size while this is resolved

From my research, you cannot change the block size on Samsung Pro EVO SSDs, so it looks like I'm stuck waiting for a resolution :(

GetStreamlined avatar Dec 18 '23 10:12 GetStreamlined

Hi @harold-b - hope you had a lovely Xmas and New Year. Do you have a rough timescale of when this will be resolved please?

James

GetStreamlined avatar Jan 04 '24 14:01 GetStreamlined

I've started up work on bladebit stuff this week. I don't have a timeframe but hopefully this one won't take much since we know exactly where the issue lies. I certainly haven't forgotten about you

haorldbchi avatar Jan 04 '24 19:01 haorldbchi

@harold-b fantastic! If you need me to test a beta release let me know :)

GetStreamlined avatar Jan 04 '24 19:01 GetStreamlined

I have the same problem ...

sonosergio avatar Mar 29 '24 07:03 sonosergio

> I have the same problem ...

I gave up waiting so I tried Gigahorse. No problem there.

GetStreamlined avatar Mar 30 '24 09:03 GetStreamlined

Same here. Probably it's better to switch to something else.

piotr-nowicki avatar Apr 17 '24 19:04 piotr-nowicki