Fatal Error on bladebit_cuda

Open ArigornStrider opened this issue 2 years ago • 5 comments

Setup: Ubuntu 22.04 on a Dell R720 with dual E5-2697v2 CPUs, 256GB RAM, and 3x FusionIO 1.6TB sx350 drives in btrfs RAID0. chia_plot_copy from MadMax was moving finished plots to the farmer over a 10Gbps network. On the third plot, bladebit_cuda (the binary downloaded from downloads.chia.net) crashed with the following message. Let me know what I left out that would be helpful for debugging. I'm wondering whether the system has too little RAM for the plotter and chia_plot_copy to run at the same time?

*** Panic!!! *** Fatal Error:
Failed to write to plot with error 5:
./bladebit_cuda(+0xcf8cb)[0x55a6a165d8cb]
./bladebit_cuda(+0xcf0af)[0x55a6a165d0af]
./bladebit_cuda(+0xbdb5e)[0x55a6a164bb5e]
./bladebit_cuda(+0xbe510)[0x55a6a164c510]
./bladebit_cuda(+0xd062d)[0x55a6a165e62d]
/lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7f42d3445b43]
/lib/x86_64-linux-gnu/libc.so.6(+0x126a00)[0x7f42d34d7a00]

Edit: Subsequent errors after the initial error:

*** Panic!!! *** Fatal Error:
Failed to write to plot with error 5:
./bladebit_cuda(+0xcf8cb)[0x557d1f28b8cb]
./bladebit_cuda(+0xcf0af)[0x557d1f28b0af]
./bladebit_cuda(+0xbdb5e)[0x557d1f279b5e]
./bladebit_cuda(+0xbe510)[0x557d1f27a510]
./bladebit_cuda(+0xd062d)[0x557d1f28c62d]
/lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7faf38ab4b43]
/lib/x86_64-linux-gnu/libc.so.6(+0x126a00)[0x7faf38b46a00]
CUDA error: 4 (0x4 ) cudaErrorCudartUnloading : driver shutting down

*** Panic!!! *** Fatal Error:
CUDA error cudaErrorCudartUnloading : driver shutting down.
./bladebit_cuda(+0xcf8cb)[0x557d1f28b8cb]
./bladebit_cuda(+0xcf0af)[0x557d1f28b0af]
./bladebit_cuda(+0x5217a)[0x557d1f20e17a]
./bladebit_cuda(+0x199ff)[0x557d1f1d59ff]
./bladebit_cuda(+0x1cf58)[0x557d1f1d8f58]
./bladebit_cuda(+0x18245)[0x557d1f1d4245]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7faf38a49d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7faf38a49e40]
./bladebit_cuda(+0x1974e)[0x557d1f1d574e]
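
For anyone trying to read the number in "Failed to write to plot with error 5": a minimal sketch for decoding it, assuming bladebit_cuda is reporting the raw OS errno from the failed write (the thread does not confirm this). `describe_errno` is a hypothetical helper name, not part of bladebit. On Linux, errno 5 is EIO, a low-level I/O error that usually points at the storage device or filesystem rather than at RAM. Errno numbers are platform-specific, so run this on the plotter itself.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'code = NAME: message' for a numeric errno on this platform.

    Hypothetical helper for illustration; assumes the number printed by the
    plotter is a plain OS errno value.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} = {name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 5 is the code from the panic message above.
    print(describe_errno(5))
```

The same call works for the other codes reported in this thread, with the caveat that the meaning depends on the operating system the plotter runs on.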

ArigornStrider avatar Feb 09 '23 06:02 ArigornStrider

My problem is: Fatal Error: Failed to open plot file with error: 3. I don't know how to fix it.

caodaye avatar Feb 11 '23 18:02 caodaye

Probably the same issue?

Final plot table pointers:
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 1:                0 ( 0x0000000000000000 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 2:                0 ( 0x0000000000000000 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 3:       1290677972 ( 0x000000004cee2ed4 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 4:      12036719472 ( 0x00000002cd71c370 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 5:      26395796022 ( 0x00000006254fe236 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 6:      41487264447 ( 0x00000009a8d56abf )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 7:      58938204372 ( 0x0000000db8fda0d4 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  C 1    :          1048576 ( 0x0000000000100000 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  C 2    :          2765796 ( 0x00000000002a33e4 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  C 3    :          2765972 ( 0x00000000002a3494 )
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]: Final plot table sizes:
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 1: 0.00 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 2: 0.00 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 3: 10248.22 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 4: 13693.88 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 5: 14392.35 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 6: 16642.51 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  Table 7: 16888.40 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  C 1    : 1.64 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  C 2    : 0.00 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]:  C 3    : 1228.25 MiB
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]: Generating plot 8: f26fa6a7e2501d31b83d3ca9a3484177ad94d2613fda4b34daf84835474e3e6b
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]: Plot temporary file: /mnt/data/plot-k32-c09-2023-02-11-21-30-f26fa6a7e2501d31b83d3ca9a3484177ad94d2613fda4b34daf84835474e3e6b.plot.tmp
Feb 11 21:30:48 hp-plotter01 bladebit_cuda[4991]: Generating F1
Feb 11 21:30:51 hp-plotter01 bladebit_cuda[4991]: Finished F1 in 2.83 seconds.
Feb 11 21:30:57 hp-plotter01 bladebit_cuda[4991]: Table 2 completed in 6.16 seconds with 4294930927 entries.
Feb 11 21:31:07 hp-plotter01 bladebit_cuda[4991]: Table 3 completed in 10.74 seconds with 4294843371 entries.
Feb 11 21:31:30 hp-plotter01 bladebit_cuda[4991]: Table 4 completed in 22.12 seconds with 4294684284 entries.
Feb 11 21:31:48 hp-plotter01 bladebit_cuda[4991]: Table 5 completed in 18.56 seconds with 4294396975 entries.
Feb 11 21:32:04 hp-plotter01 bladebit_cuda[4991]: Table 6 completed in 15.51 seconds with 4293816053 entries.
Feb 11 21:32:14 hp-plotter01 bladebit_cuda[4991]: Table 7 completed in 10.64 seconds with 4292570364 entries.
Feb 11 21:32:14 hp-plotter01 bladebit_cuda[4991]: Finalizing Table 7
Feb 11 21:32:20 hp-plotter01 bladebit_cuda[4991]: Finalized Table 7 in 5.68 seconds.
Feb 11 21:32:20 hp-plotter01 bladebit_cuda[4991]: Completed Phase 1 in 92.24 seconds
Feb 11 21:32:23 hp-plotter01 bladebit_cuda[4991]: Marked Table 6 in 2.95 seconds.
Feb 11 21:32:26 hp-plotter01 bladebit_cuda[4991]: Marked Table 5 in 2.58 seconds.
Feb 11 21:32:28 hp-plotter01 bladebit_cuda[4991]: Marked Table 4 in 2.47 seconds.
Feb 11 21:32:28 hp-plotter01 bladebit_cuda[4991]: Completed Phase 2 in 8.00 seconds
Feb 11 21:32:28 hp-plotter01 bladebit_cuda[4991]: Compressing Table 3 and 4...
Feb 11 21:32:34 hp-plotter01 bladebit_cuda[4991]:  Step 1 completed step in 5.71 seconds.
Feb 11 21:32:41 hp-plotter01 bladebit_cuda[4991]:  Step 2 completed step in 7.00 seconds.
Feb 11 21:32:41 hp-plotter01 bladebit_cuda[4991]: Completed table 3 in 12.71 seconds with 3465670903 / 4294684284 entries ( 80.70% ).
Feb 11 21:32:41 hp-plotter01 bladebit_cuda[4991]: Compressing tables 4 and 5...
Feb 11 21:32:47 hp-plotter01 bladebit_cuda[4991]:  Step 1 completed step in 6.02 seconds.
Feb 11 21:32:57 hp-plotter01 bladebit_cuda[4991]:  Step 2 completed step in 9.93 seconds.
Feb 11 21:33:04 hp-plotter01 bladebit_cuda[4991]:  Step 3 completed step in 7.25 seconds.
Feb 11 21:33:04 hp-plotter01 bladebit_cuda[4991]: Completed table 4 in 23.21 seconds with 3532255459 / 4294396975 entries ( 82.25% ).
Feb 11 21:33:04 hp-plotter01 bladebit_cuda[4991]: Compressing tables 5 and 6...
Feb 11 21:33:10 hp-plotter01 bladebit_cuda[4991]:  Step 1 completed step in 6.09 seconds.
Feb 11 21:33:20 hp-plotter01 bladebit_cuda[4991]:  Step 2 completed step in 10.19 seconds.
Feb 11 21:33:28 hp-plotter01 bladebit_cuda[4991]:  Step 3 completed step in 7.54 seconds.
Feb 11 21:33:28 hp-plotter01 bladebit_cuda[4991]: Completed table 5 in 23.82 seconds with 3712380165 / 4293816053 entries ( 86.46% ).
Feb 11 21:33:28 hp-plotter01 bladebit_cuda[4991]: Compressing tables 6 and 7...
Feb 11 21:33:34 hp-plotter01 bladebit_cuda[4991]:  Step 1 completed step in 6.10 seconds.
Feb 11 21:33:45 hp-plotter01 bladebit_cuda[4991]:  Step 2 completed step in 11.10 seconds.
Feb 11 21:33:53 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Command buffer full. Waiting for commands.
Feb 11 21:33:53 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 0.000000 seconds for a Command to be available.
Feb 11 21:33:53 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Command buffer full. Waiting for commands.
Feb 11 21:33:59 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 6.440000 seconds for a Command to be available.
Feb 11 21:34:00 hp-plotter01 bladebit_cuda[4991]:  Step 3 completed step in 14.78 seconds.
Feb 11 21:34:00 hp-plotter01 bladebit_cuda[4991]: Completed table 6 in 31.98 seconds with 4292570364 / 4292570364 entries ( 100.00% ).
Feb 11 21:34:00 hp-plotter01 bladebit_cuda[4991]: Serializing P7 entries
Feb 11 21:34:01 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Command buffer full. Waiting for commands.
Feb 11 21:34:10 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 8.672000 seconds for a Command to be available.
Feb 11 21:34:12 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Command buffer full. Waiting for commands.
Feb 11 21:34:21 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 8.496000 seconds for a Command to be available.
Feb 11 21:34:21 hp-plotter01 bladebit_cuda[4991]: Completed serializing P7 entries in 21.42 seconds.
Feb 11 21:34:21 hp-plotter01 bladebit_cuda[4991]: Completed Phase 3 in 113.14 seconds
Feb 11 21:34:21 hp-plotter01 bladebit_cuda[4991]: Completed Plot 1 in 213.38 seconds ( 3.56 minutes )
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: *** Panic!!! *** Fatal Error:
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: Failed to write to plot with error 112:
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: /usr/bin/bladebit_cuda(+0xce8fd)[0x55bc4511e8fd]
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: /usr/bin/bladebit_cuda(+0xce0cf)[0x55bc4511e0cf]
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: /usr/bin/bladebit_cuda(+0xbcb5d)[0x55bc4510cb5d]
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: /usr/bin/bladebit_cuda(+0xbd4c0)[0x55bc4510d4c0]
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: /usr/bin/bladebit_cuda(+0xcf676)[0x55bc4511f676]
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f93a86d7609]
Feb 11 21:35:19 hp-plotter01 bladebit_cuda[4991]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f93a8291133]
Feb 11 21:36:06 hp-plotter01 systemd[1]: bladebit.service: Main process exited, code=exited, status=1/FAILURE
Feb 11 21:36:06 hp-plotter01 systemd[1]: bladebit.service: Failed with result 'exit-code'.
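
To put a number on the "[PlotWriter] Waited ..." lines in the log above, here is a rough sketch that sums the reported wait times. It assumes the line format stays exactly as shown in this log; `total_writer_wait` is a hypothetical helper, not a bladebit tool. Long waits here may mean the plotter is producing data faster than the destination accepts writes, but that is an interpretation, not something the log states.

```python
import re

# Matches lines such as:
# "... [PlotWriter] Waited 6.440000 seconds for a Command to be available."
WAIT_RE = re.compile(r"\[PlotWriter\] Waited ([0-9.]+) seconds")


def total_writer_wait(log_lines):
    """Sum the seconds reported in '[PlotWriter] Waited ...' lines."""
    total = 0.0
    for line in log_lines:
        match = WAIT_RE.search(line)
        if match:
            total += float(match.group(1))
    return total


# The four wait lines copied from the log above.
sample = [
    "Feb 11 21:33:53 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 0.000000 seconds for a Command to be available.",
    "Feb 11 21:33:59 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 6.440000 seconds for a Command to be available.",
    "Feb 11 21:34:10 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 8.672000 seconds for a Command to be available.",
    "Feb 11 21:34:21 hp-plotter01 bladebit_cuda[4991]: [PlotWriter] Waited 8.496000 seconds for a Command to be available.",
]

if __name__ == "__main__":
    print(f"Total PlotWriter wait: {total_writer_wait(sample):.3f} s")
```

To run it over a whole session, pass the lines of a saved journal file instead of `sample`.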

mmitech avatar Feb 11 '23 20:02 mmitech

I'm getting the same "Command buffer full" message. I just killed some processes that weren't really using many resources, and I'm hoping this will let me plot on through. My plot finished, but it took about 10 minutes.

ShitcoinSolutions avatar Feb 11 '23 20:02 ShitcoinSolutions

Might be worth checking out Arch Linux or Clear Linux, as those are fairly stripped-down distros. I'm looking at Arch for my farmer, but haven't made the switch yet.

ArigornStrider avatar Feb 11 '23 21:02 ArigornStrider

hey @ArigornStrider it's FlipThisCrypto from Discord. I thought I changed my old name on here lol

ShitcoinSolutions avatar Feb 11 '23 21:02 ShitcoinSolutions