bfgminer

Degrading hashrates over time (bug?)

Open silverdr opened this issue 11 years ago • 18 comments

I noticed that the longer bfgminer runs, the more the hash rates suffer, and a constant stream of messages like:

BIF10: Unknown job id: job 74c 5452f7e3 3

becomes the most prominent log output. I am not sure what this means, but I have verified several times that a fresh start gives roughly nominal hash rates for all connected mining devices. Then the rates drop and the stream of "Unknown job id" messages intensifies. The remedy is to restart bfgminer.

I am using bfgminer version 4.8.0-34-g9c71295 and 12 USB miners (BIF and AMU) under OS X, but I don't expect this to be OS-specific.

silverdr avatar Oct 31 '14 02:10 silverdr

Sounds like a bug in your BIF firmware. Is it a real bi*fury, or some clone? Have you tried to use any of the firmware-bug workarounds in README.ASIC?

luke-jr avatar Oct 31 '14 03:10 luke-jr

I can't be sure whether they are "real" or some clone. I got them second hand (except the hex one). Quality-wise they look like a good product. The manufacturer listed is c-scape, but this probably doesn't mean much in terms of who really produced them. There is also an AMU device. Anyway - thank you for the suggestion. I have now set the number of chips for the hex_fury by its serial number and used --set bifury:free_after_job=no. I take the last one to be global and to affect all the other bi_fury devices too, right? I shall see how it performs over time.

silverdr avatar Oct 31 '14 12:10 silverdr

--set bifury:* is global. If you have serial numbers, you can use --set bifury@serial:* instead

luke-jr avatar Oct 31 '14 13:10 luke-jr

I used

$ bfgminer --set bifury@serialofthehexfury:chips=6 --set bifury:free_after_job=no

Shall check the results after a day or two. Regards.

silverdr avatar Oct 31 '14 13:10 silverdr

Unfortunately it didn't help. I tried one more approach and listed all devices (including the dual-chip ones) by serial number, together with their appropriate :chips=2 entries. That didn't help either. As always, quitting and restarting bfgminer helps for some time. The last thing I haven't tried is to also set "free_after_job=no" individually for each device, but this shouldn't be needed if it is set globally, should it?

silverdr avatar Nov 02 '14 15:11 silverdr

If you set it globally, doing it per-serial shouldn't matter. A debug log would show this.

luke-jr avatar Nov 02 '14 15:11 luke-jr

My current config looks like this:

{
"pools" : [
]
,
"api-mcast-port" : "4028",
"api-port" : "4028",
"expiry" : "120",
"expiry-lp" : "3600",
"failover-switch-delay" : "300",
"log" : "20",
"no-pool-disable" : true,
"no-client-reconnect" : true,
"no-show-processors" : true,
"no-show-procs" : true,
"queue" : "1",
"quiet-work-updates" : true,
"quiet-work-update" : true,
"scan-time" : "60",
"skip-security-checks" : "0",
"submit-stale" : true,
"temp-hysteresis" : "3",
"shares" : 0,
"kernel-path" : "/usr/local/share/bfgminer",
"scan" : [
        "bifury:all",
        "antminer:all"
],
"set-device" : [
        "bifury:free_after_job=no",
        "bifury@0a00301bfcb77fc:chips=6",
        "bifury@12120f1e025416e:chips=2",
        "bifury@1313fd3702542a4:chips=2",
        "bifury@13131b230254250:chips=2",
        "bifury@12122029025409a:chips=2",
        "bifury@1414231c02544ed:chips=2",
        "bifury@1313f02202542fd:chips=2",
        "bifury@181829210254a12:chips=2",
        "bifury@141428300254418:chips=2",
        "bifury@1313ec0e025437c:chips=2",
        "bifury@14142511025455b:chips=2"
]
}

Is there anything suspicious in there? Anything else I could try or anything I could do to help tracking the problem down?

silverdr avatar Nov 02 '14 16:11 silverdr
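With eleven serial numbers, a "set-device" list like the one above is easy to get wrong by hand. As a minimal sketch, it could be generated from a serial-to-chip-count mapping. The helper name `build_set_device` is hypothetical, and the `bifury:option=value` and `bifury@serial:chips=N` entry shapes are simply copied from the config in this thread, not taken from bfgminer's documentation.

```python
import json


def build_set_device(global_opts, serial_to_chips):
    """Return a "set-device" list in the shape used in the config above.

    global_opts:      options applied to every bifury device, e.g. {"free_after_job": "no"}
    serial_to_chips:  mapping of device serial number -> chip count
    """
    entries = [f"bifury:{name}={value}" for name, value in global_opts.items()]
    entries += [f"bifury@{serial}:chips={chips}" for serial, chips in serial_to_chips.items()]
    return entries


# Two of the serial numbers from the config above, used as sample input.
devices = {
    "0a00301bfcb77fc": 6,  # the hex device
    "12120f1e025416e": 2,  # one of the dual-chip devices
}

fragment = {"set-device": build_set_device({"free_after_job": "no"}, devices)}
print(json.dumps(fragment, indent=8))
```

The printed fragment can be pasted into the config file in place of the hand-written list; the rest of the config is left untouched.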

Another option you may be able to play with is --set bifury:max_queued=N - it defaults to (chips * 5) + 6. Also note osc6_bits for clock speeds.

luke-jr avatar Nov 02 '14 16:11 luke-jr
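To put numbers on the default quoted above, here is a small sketch that evaluates (chips * 5) + 6 for the two chip counts in this thread and compares it with the chips*2, chips*10 and chips*20 values tried later on. The function name `default_max_queued` is made up for illustration; only the formula comes from the comment above.

```python
def default_max_queued(chips):
    """Default max_queued as quoted in the thread: (chips * 5) + 6."""
    return chips * 5 + 6


for chips in (2, 6):  # dual-chip devices and the six-chip hex device
    default = default_max_queued(chips)
    for factor in (2, 10, 20):  # multipliers tried later in the thread
        candidate = chips * factor
        relation = "below" if candidate < default else "at or above"
        print(f"chips={chips}: chips*{factor}={candidate} is {relation} the default of {default}")
```

For a dual-chip device the default works out to 16, so chips*2 (4) is well below it, while chips*10 (20) and chips*20 (40) are above it.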

AFAIR there was a mechanism for automatically increasing the queue if needed. Bearing in mind that the problem shows up after some time, this might in fact be somehow related to the queue length. Do you mean limiting the max to - say - chips*2, for example?

silverdr avatar Nov 02 '14 16:11 silverdr

Setting it to chips or chips*2 gives 0 hash rate and 100% HW errors. Trying the other way around now (chips*10).

silverdr avatar Nov 02 '14 16:11 silverdr

Yeah, I'd expect that... if you only keep 5 work items (max_queued), and the device is using 16, you're going to be missing most :)

luke-jr avatar Nov 02 '14 16:11 luke-jr
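The point above can be illustrated with a toy model. This is not bfgminer's actual queue code, only a sketch under two assumptions: the host remembers just the newest `max_queued` job ids, and the device reports each result once `device_depth` jobs have been sent. Every name in it is hypothetical.

```python
from collections import deque


def count_unknown_results(max_queued, device_depth, total_jobs=100):
    """Count results that arrive for job ids the host has already forgotten."""
    remembered = deque(maxlen=max_queued)  # oldest ids drop off automatically
    unknown = 0
    for job_id in range(total_jobs):
        remembered.append(job_id)
        finished = job_id - device_depth + 1  # job whose result comes back now
        if finished >= 0 and finished not in remembered:
            unknown += 1  # the host would log something like "Unknown job id"
    return unknown


print(count_unknown_results(max_queued=5, device_depth=16))   # host forgets too early
print(count_unknown_results(max_queued=16, device_depth=16))  # host keeps enough ids
```

In this model, once the host keeps fewer job ids than the device has in flight, every returned result refers to a forgotten job, which matches the 0 hash rate and 100% HW errors reported above.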

:-)) I see.. well, I didn't check the exact meaning. I kind of thought that "max_queued" is what's on top of what the device is actually working on already. Like a FIFO buffer to shield from network latency. That's why I thought it might be connected with the queue length increasing mechanism. But admittedly I don't know much (and close to nothing when compared to you, I guess ;-) about bfgminer's inner workings. Now checking with the chips*10. For results shall have to wait another day or two it seems.

silverdr avatar Nov 02 '14 17:11 silverdr

Setting it to chips*10 didn't help much. Setting it to chips*20 helped in that the flow of "Unknown job id" messages no longer appears. But the hash rate drops regardless:

BIF 0: 53.9C | 4.26/ 4.48/ 4.48Gh/s | A:0 R:0+0(none) HW: 208488/ 13%
BIF 1: 45.5C | 4.05/ 4.18/ 4.18Gh/s | A:0 R:0+0(none) HW: 247516/ 15%
BIF 2: 48.5C | 4.60/ 4.54/ 4.54Gh/s | A:0 R:0+0(none) HW: 259823/ 15%
BIF 3: 51.0C | 4.09/ 4.33/ 4.33Gh/s | A:0 R:0+0(none) HW: 183601/ 12%
BIF 4: 46.4C | 3.47/ 4.49/ 4.49Gh/s | A:0 R:0+0(none) HW: 299093/ 17%
BIF 5: 47.9C | 3.94/ 4.51/ 4.51Gh/s | A:0 R:0+0(none) HW: 258710/ 15%
BIF 6: 48.4C | 3.47/ 4.47/ 4.47Gh/s | A:0 R:0+0(none) HW: 234319/ 14%
BIF 7: 44.6C | 4.23/ 4.49/ 4.49Gh/s | A:0 R:0+0(none) HW: 220481/ 13%
BIF 8: 53.0C | 4.00/ 4.13/ 4.13Gh/s | A:0 R:0+0(none) HW: 403825/ 23%
BIF 9: 47.6C | 4.57/ 4.28/ 4.28Gh/s | A:0 R:0+0(none) HW: 311417/ 18%

Do you have any more ideas about what could be causing this?

silverdr avatar Dec 05 '14 01:12 silverdr
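To watch the HW error percentage grow over time without reading the status lines by eye, they could be parsed with a regular expression. The sketch below is fitted only to the status lines pasted above, so the pattern is an assumption about their layout, and the three hash rate figures are kept as an unlabeled tuple because the thread does not say what each one means. The 20% threshold is arbitrary.

```python
import re

# Pattern fitted to the status lines pasted above; not an official format description.
STATUS_RE = re.compile(
    r"(?P<device>[A-Z]{3}\s*\d+):\s*(?P<temp>[\d.]+)C\s*\|\s*"
    r"(?P<r1>[\d.]+)/\s*(?P<r2>[\d.]+)/\s*(?P<r3>[\d.]+)Gh/s.*?"
    r"HW:\s*(?P<hw>\d+)/\s*(?P<hw_pct>[\d.]+)%"
)


def parse_status(text):
    """Return one dict per device status found in `text`."""
    return [
        {
            "device": m["device"],
            "temp_c": float(m["temp"]),
            "rates_ghs": (float(m["r1"]), float(m["r2"]), float(m["r3"])),
            "hw_errors": int(m["hw"]),
            "hw_pct": float(m["hw_pct"]),
        }
        for m in STATUS_RE.finditer(text)
    ]


# Two of the status lines from the comment above, used as sample input.
sample = (
    "BIF 0: 53.9C | 4.26/ 4.48/ 4.48Gh/s | A:0 R:0+0(none) HW: 208488/ 13% "
    "BIF 8: 53.0C | 4.00/ 4.13/ 4.13Gh/s | A:0 R:0+0(none) HW: 403825/ 23%"
)
statuses = parse_status(sample)
suspect = [s["device"] for s in statuses if s["hw_pct"] > 20]
print(suspect)
```

Logging these values periodically would show how quickly the HW percentage climbs after a fresh start, which may help narrow down when the degradation begins.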

Is your hardware error rate usually ~15%?

luke-jr avatar Dec 05 '14 06:12 luke-jr

No. At the beginning it is about one to three percent. Sometimes a bit more, but not too much. Then the HW percentage starts growing and the hash rate drops accordingly.

silverdr avatar Dec 08 '14 23:12 silverdr

Can you upload a debug log somewhere?

luke-jr avatar Dec 08 '14 23:12 luke-jr

Sure. I am not at that machine now but shall have a look next time there.

silverdr avatar Dec 09 '14 00:12 silverdr

I instead get 100% HW errors, and the bifury (Red) shows no hashrate at all. bfgminer 5.4.2 on Mac. I am sad, I have never seen it working. Does anyone have ideas?

tieko avatar Dec 26 '17 19:12 tieko