Workload never starts (FahCore returned: INTERRUPTED (102 = 0x66))
Your Environment
- F@H Software version: 7.6.21 and 7.4.4
- Operating System: Debian 9
- Browser: N/A (FAHControl)
Expected Behavior
The work queue progresses and the core gets started
Current Behavior
The core crashes(?) and never starts. The system keeps returning "FahCore returned: INTERRUPTED (102 = 0x66)". It doesn't drop the work; instead it gets stuck waiting for the core to start and retries every minute.
09:26:11:WU01:FS00:Starting
09:26:11:WU01:FS00:Removing old file 'work/01/logfile_01-20201130-085702.txt'
09:26:11:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 14344 -checkpoint 15 -np 8
09:26:11:WU01:FS00:Started FahCore on PID 14414
09:26:11:WU01:FS00:Core PID:14418
09:26:11:WU01:FS00:FahCore 0xa8 started
09:26:12:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
It used to work, but I think that may have been with FahCore_a7 or a different work unit/project.
I have two Linux machines that are stuck on this project (Project: 16926, PRCGs: 16926 (78, 786, 4) and 16926 (29, 636, 5)).
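To see how often a slot hits this, a minimal sketch like the one below could count the "FahCore returned" lines per work unit and slot. It assumes the log lines follow the `HH:MM:SS:WUnn:FSnn:message` layout shown above; the function name `count_core_returns` and the `sample` list are only illustrative and not part of the F@H client.

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpt in this report:
# 09:26:12:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
RETURN_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"FahCore returned: (?P<status>\w+) \((?P<code>\d+) = 0x[0-9a-fA-F]+\)"
)


def count_core_returns(lines):
    """Count FahCore return statuses per (work unit, slot, status)."""
    counts = Counter()
    for line in lines:
        match = RETURN_RE.match(line.strip())
        if match:
            counts[(match["wu"], match["slot"], match["status"])] += 1
    return counts


# Hypothetical input: a few lines copied from the excerpt above.
sample = [
    "09:26:11:WU01:FS00:Started FahCore on PID 14414",
    "09:26:11:WU01:FS00:Core PID:14418",
    "09:26:11:WU01:FS00:FahCore 0xa8 started",
    "09:26:12:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
]

for (wu, slot, status), n in count_core_returns(sample).items():
    print(f"{wu} {slot} {status}: {n}")
```

To run it against a real log, pass the open log file to `count_core_returns` instead of `sample`; the file's location depends on the install, so check where your client writes it.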
Steps To Reproduce
- Install FAH; either version 7.4.4 or the latest 7.6.21 will do
- Have a CPU Folding slot
- At some point it will stop working, but I'm unsure if this is due to this specific workload or the use of a newer FahCore
Context
Due to this issue the system is now idle, "wasting" CPU cycles. Also at the moment the system partially heats my room, so I'm colder :)
Possible duplicate of #1570. However, my issue is about CPU folding.
Dropped the workload and received a new workload from project 16926 that is working without issues. Still on FahCore a8, no idea why the old workload had problems. I think it's likely it will come back at some point.
EDIT: As expected, got another workload with the same issue. PRCG 16926 (59, 832, 1)
Hiya @smiba
Not sure what to make of it, since your machine does have 8 CPUs, so in theory the Project should work fine. Since it only happens on a single Project, I will ask around and see what happens 😄