mmx-node

[Node] WARN: VDF verification failed with: invalid output at segment 0

Open Ealrann opened this issue 2 years ago • 18 comments

Hello, on a synced node I'm getting a recurring warning (a few times per second): [Node] WARN: VDF verification failed with: invalid output at segment 0

The CPU is an Intel i7-8700 with a UHD 630 iGPU. I'm on Arch Linux, and OpenCL seems to be working:

# mmx node info
Synced: Yes
Height: 56094
[Node] INFO: Waiting on VDF for height 56106
[Node] INFO: Waiting on VDF for height 56107
[Node] WARN: VDF verification failed with: invalid output at segment 0
[TimeLord] INFO: 3.48753 M/s iterations
[Node] INFO: Waiting on VDF for height 56103
[Node] INFO: Waiting on VDF for height 56104
[Node] INFO: Waiting on VDF for height 56105
[Node] INFO: Waiting on VDF for height 56106
[Node] INFO: Waiting on VDF for height 56107
[Node] WARN: VDF verification failed with: invalid output at segment 0
[Node] INFO: Waiting on VDF for height 56103
[Node] INFO: Waiting on VDF for height 56104
[Node] INFO: Waiting on VDF for height 56105
[Node] INFO: Waiting on VDF for height 56106
[Node] INFO: Waiting on VDF for height 56107
[Node] WARN: VDF verification failed with: invalid output at segment 0
[Node] INFO: Waiting on VDF for height 56103
[Node] INFO: Waiting on VDF for height 56104
[Node] INFO: Waiting on VDF for height 56105
[Node] INFO: Waiting on VDF for height 56106
[Node] INFO: Waiting on VDF for height 56107
[Node] WARN: VDF verification failed with: invalid output at segment 0
[Node] INFO: Waiting on VDF for height 56108
[Node] INFO: Waiting on VDF for height 56103
[Node] INFO: Waiting on VDF for height 56104
[Node] INFO: Waiting on VDF for height 56105
[Node] INFO: Waiting on VDF for height 56106
[Node] INFO: Waiting on VDF for height 56107
[Node] WARN: VDF verification failed with: invalid output at segment 0
[Node] INFO: Waiting on VDF for height 56103
[Node] INFO: Waiting on VDF for height 56104
[Node] INFO: Waiting on VDF for height 56105
[Node] INFO: Waiting on VDF for height 56106
[Node] INFO: Waiting on VDF for height 56107
[Node] WARN: VDF verification failed with: invalid output at segment 0
[Node] INFO: Waiting on VDF for height 56103
[Node] INFO: Waiting on VDF for height 56104
[Node] INFO: Waiting on VDF for height 56105
[Node] INFO: Waiting on VDF for height 56106
[Node] INFO: Waiting on VDF for height 56107

Edit1

It looks like the problem only appears with the new Intel driver (intel-compute-runtime, which is the main driver on Arch Linux). When I switch to the old driver (intel-opencl on Arch Linux), the VDF is verified correctly.

Ealrann avatar Dec 28 '21 17:12 Ealrann

Looks like the VDF is not getting verified; some of my console output is below.

"intel_gpu_top reports some usage" - Do you see the MHz and Watts in the top line ramp up sharply every ~10 seconds? That's the VDF heartbeat you get the warnings for. The output should be: [Node] INFO: Verified VDF at 1508314453000 iterations, delta = 10.2222 sec, took 3.11575 sec

Something is not OK with OpenCL, I guess. I don't know Arch, so I can't help much.

(I'm using an Intel NUC with a Pentium J5005 and Intel® UHD Graphics 605)

[Node] INFO: Verified VDF at 1508281144000 iterations, delta = 9.82702 sec, took 3.10414 sec
[Node] INFO: Finalized height 56673 with: ntx = 0, score = 465, k = 30, tdiff = 33353, sdiff = 36155
[Node] INFO: New peak at height 56679 with score 182
[Harvester] INFO: 4 plots were eligible for height 56682, best score was 9012, took 0.074925 sec
[TimeLord] INFO: 1.34195 M/s iterations
[TimeLord] INFO: Restarted VDF at 1508281144000
[Node] INFO: Verified VDF at 1508314453000 iterations, delta = 10.2222 sec, took 3.11575 sec
[Node] INFO: Finalized height 56674 with: ntx = 0, score = 88, k = 32, tdiff = 33383, sdiff = 36165
[Node] INFO: New peak at height 56680 with score 80
[Harvester] INFO: 2 plots were eligible for height 56683, best score was 2995, took 0.046341 sec
[TimeLord] INFO: 1.33232 M/s iterations
[Node] INFO: Verified VDF at 1508347762000 iterations, delta = 10.1099 sec, took 3.11653 sec
[Node] INFO: Finalized height 56675 with: ntx = 0, score = 27, k = 30, tdiff = 33411, sdiff = 36193
[Node] INFO: New peak at height 56681 with score 83
[Harvester] INFO: 3 plots were eligible for height 56684, best score was 3919, took 0.057209 sec
[TimeLord] INFO: 1.33369 M/s iterations
[Node] INFO: Verified VDF at 1508381071000 iterations, delta = 10.0047 sec, took 3.10998 sec
[Node] INFO: Finalized height 56676 with: ntx = 0, score = 60, k = 32, tdiff = 33382, sdiff = 36211

xkredr avatar Dec 28 '21 19:12 xkredr
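To compare a healthy node with a failing one, it can help to count the two kinds of lines shown above. The following is a minimal sketch, not part of mmx-node: the function name `summarize_vdf_log` is hypothetical, and the patterns are modeled only on the log lines quoted in this thread, so other versions may format them differently.

```python
import re

# Patterns modeled on the log lines quoted in this thread (an assumption:
# other mmx-node versions may word these lines differently).
VERIFIED_RE = re.compile(
    r"\[Node\] INFO: Verified VDF at (\d+) iterations, "
    r"delta = ([\d.]+) sec, took ([\d.]+) sec"
)
FAILED_RE = re.compile(r"\[Node\] WARN: VDF verification failed with: (.+)")


def summarize_vdf_log(lines):
    """Collect verified and failed VDF checks from an iterable of log lines."""
    verified = []  # tuples of (iterations, delta_sec, took_sec)
    failures = []  # failure reason strings
    for line in lines:
        match = VERIFIED_RE.search(line)
        if match:
            verified.append(
                (int(match.group(1)), float(match.group(2)), float(match.group(3)))
            )
            continue
        match = FAILED_RE.search(line)
        if match:
            failures.append(match.group(1).strip())
    return {"verified": verified, "failures": failures}


if __name__ == "__main__":
    sample = [
        "[Node] INFO: Verified VDF at 1508314453000 iterations, delta = 10.2222 sec, took 3.11575 sec",
        "[Node] WARN: VDF verification failed with: invalid output at segment 0",
    ]
    summary = summarize_vdf_log(sample)
    print(len(summary["verified"]), "verified,", len(summary["failures"]), "failed")
```

On a node like the one in the original report, the `verified` list would presumably stay empty while `failures` keeps growing.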

"intel_gpu_top reports some usage" - Do you see the MHz and Watts in the top line ramp up sharply every ~10 seconds?

Yes, it jumps from 0 to ~60% usage every 10 seconds. If I stop the node, the iGPU is never used (no screen is plugged in; it's a headless server).

It's rather easy to use Intel OpenCL on Arch Linux: the new Intel OpenCL driver, intel-compute-runtime, is available on Arch.

Ealrann avatar Dec 28 '21 19:12 Ealrann

Well, this [Node] WARN: VDF verification failed with: invalid output at segment 0 is one for Max I guess;-)

xkredr avatar Dec 28 '21 20:12 xkredr

Ok, I replaced intel-compute-runtime with the older driver (intel-opencl); the warning disappeared and the VDFs are now correctly verified. So I guess there is something wrong with the intel-compute-runtime?

Ealrann avatar Dec 28 '21 20:12 Ealrann

Well, this [Node] WARN: VDF verification failed with: invalid output at segment 0 is one for Max I guess;-)

Looks like the OpenCL computation on the device is not correct...

madMAx43v3r avatar Dec 28 '21 20:12 madMAx43v3r
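For readers wondering what "invalid output at segment 0" could mean, here is a rough sketch of segment-wise verification. It assumes the VDF is an iterated SHA-256 chain with recorded checkpoints; that is an assumption made for illustration, and the real mmx-node OpenCL kernel may work differently. All function names are hypothetical.

```python
import hashlib


def run_chain(start: bytes, iterations: int) -> bytes:
    """Apply SHA-256 repeatedly; each step depends on the previous one."""
    value = start
    for _ in range(iterations):
        value = hashlib.sha256(value).digest()
    return value


def make_proof(seed: bytes, num_segments: int, iters_per_segment: int):
    """Prover side: walk the chain once and record each segment's output."""
    outputs = []
    value = seed
    for _ in range(num_segments):
        value = run_chain(value, iters_per_segment)
        outputs.append(value)
    return outputs


def verify_proof(seed, outputs, iters_per_segment, compute=run_chain):
    """Verifier side: recompute every segment from its known start value.

    Once the checkpoints are known, the segments are independent, so a
    verifier could hand them to a parallel device. `compute` stands in for
    that device here. Returns None on success, otherwise the index of the
    first segment whose recomputed output does not match.
    """
    starts = [seed] + outputs[:-1]
    for index, (start, expected) in enumerate(zip(starts, outputs)):
        if compute(start, iters_per_segment) != expected:
            return index
    return None


if __name__ == "__main__":
    proof = make_proof(b"seed", num_segments=4, iters_per_segment=1000)
    print(verify_proof(b"seed", proof, 1000))  # correct compute function

    # A device that returns wrong results fails on the very first segment,
    # even though the proof itself is valid.
    def broken_device(start, iterations):
        return bytes(32)

    print(verify_proof(b"seed", proof, 1000, compute=broken_device))
```

Under these assumptions, a valid proof checked by a miscomputing device is rejected at the first segment, which would be consistent with the warning always naming segment 0.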

So I guess there is something wrong with the intel-compute-runtime?

yes

madMAx43v3r avatar Dec 28 '21 20:12 madMAx43v3r

@Ealrann The same CPU works fine for me on Ubuntu 21.10, but only with the latest version of OpenCL. Try installing the latest version from https://github.com/intel/compute-runtime/releases/latest

gorenstein avatar Dec 28 '21 20:12 gorenstein

So I guess there is something wrong with the intel-compute-runtime?

yes

@madMAx43v3r I just ran an OpenCL benchmark (https://github.com/ekondis/mixbench), and it works. I don't see what more I can do on the OpenCL side, since everything shows it's working (except with mmx-node).

After the benchmark, I cloned the repo again, rebuilt everything and resynced the blockchain, but the error is still there.

Ealrann avatar Dec 28 '21 22:12 Ealrann

@gorenstein Since I managed to run an OpenCL benchmark, it's unlikely the Intel driver is faulty (and TBH, outside of using the package manager, I'm not really confident with installing drivers manually). But I'll give it a try when a new package is released (which is generally very fast on Arch).

Ealrann avatar Dec 28 '21 22:12 Ealrann

@gorenstein Since I managed to run an OpenCL benchmark, it's unlikely the Intel driver is faulty (and TBH, outside of using the package manager, I'm not really confident with installing drivers manually). But I'll give it a try when a new package is released (which is generally very fast on Arch).

hard to say since the code is not the same... if I can reproduce the issue myself I might be able to figure something out

madMAx43v3r avatar Dec 29 '21 09:12 madMAx43v3r

For what it's worth, I get this error also.

ghost avatar Dec 30 '21 19:12 ghost

Try again with latest version.

madMAx43v3r avatar Dec 31 '21 23:12 madMAx43v3r

Ok, I'll try tomorrow. Meanwhile, I used a Radeon RX 560, and it worked perfectly.

Ealrann avatar Dec 31 '21 23:12 Ealrann

Ok, I removed the discrete GPU, removed the opencl-amd package, reinstalled intel-compute-runtime and updated mmx, but unfortunately the warning still appears. The mixbench demo is still working.

Can I do something to see more debug logs?

Otherwise, I could provide an ssh access to this server if you need?

Ealrann avatar Jan 01 '22 14:01 Ealrann

Ok, looks like I have to make a standalone test tool for OpenCL.

madMAx43v3r avatar Jan 01 '22 22:01 madMAx43v3r
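A benchmark that runs does not show that a device computes correct results, which is the gap a standalone test tool would close. The sketch below illustrates the idea with a known-answer comparison against a CPU reference. It is not the tool mentioned above: `device_hash` is a hypothetical stand-in for whatever function would run the kernel on the GPU, and SHA-256 is assumed as the primitive only for illustration.

```python
import hashlib
import os


def cpu_reference(data: bytes) -> bytes:
    """Trusted CPU result used as the expected answer."""
    return hashlib.sha256(data).digest()


def self_test(device_hash, num_random_cases=64, input_len=32):
    """Compare a device hash function with the CPU reference.

    `device_hash` is any callable taking bytes and returning bytes.
    Returns the list of inputs on which the two implementations disagree;
    an empty list means every case matched.
    """
    cases = [b"", b"abc", bytes(input_len)]
    cases += [os.urandom(input_len) for _ in range(num_random_cases)]
    return [data for data in cases if device_hash(data) != cpu_reference(data)]


if __name__ == "__main__":
    # Sanity check: the reference trivially agrees with itself.
    print("mismatches:", len(self_test(cpu_reference)))

    # A function that runs without errors but returns the wrong digest,
    # similar in spirit to a driver that passes a benchmark yet miscomputes.
    def miscomputing_device(data):
        return hashlib.sha256(data + b"\x00").digest()

    print("mismatches:", len(self_test(miscomputing_device)))
```

Reporting the failing inputs rather than a single pass/fail flag would make it easier to attach a reproducible case to a driver bug report.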

I guess you're right. Since OpenCL is the number one topic for new MMX farmers, it could help. If you need me to test anything, let me know. I'll also try again when Arch updates intel-compute-runtime.

Ealrann avatar Jan 01 '22 22:01 Ealrann

[image] [image]

This error occurs on two of my systems: 2700X / RX580 and 5950X / RX6600XT. OS: Windows 11, latest official stable drivers.

OpenCL is used correctly: 2022-01-02 05:57:10 [Node] INFO: Using OpenCL GPU device: 0

If I try to disable it (by specifying -1 in the parameters), synchronization becomes completely impossible because verification takes too long.

GMND avatar Jan 02 '22 09:01 GMND

Seems like the same issue yes...

madMAx43v3r avatar Jan 02 '22 18:01 madMAx43v3r