stratum-mining-litecoin icon indicating copy to clipboard operation
stratum-mining-litecoin copied to clipboard

INFO template_registry # Job id 'fd1' not found

Open ghost opened this issue 11 years ago • 11 comments

Hi

I just started using stratum (upgraded my pool from pushpoold), and I have been very happy with it. It is an amazing product for sure. I had run into an issue today when a user came to the litecoin pool with over 30MHs. Most of their shares were rejected and this error kept showing up in my stratum.log file.

2013-12-02 15:16:05,212 INFO template_registry # Job id 'fd1' not found 2013-12-02 15:16:05,212 INFO interfaces # False (0) INVALID user.1

any idea why this happened, and what i can do to fix it so they can mine again?

Thanks

ghost avatar Dec 03 '13 00:12 ghost

I am experiencing this issue as well. Anyone have any ideas?

CaptainAK avatar Dec 07 '13 04:12 CaptainAK

Mining with SHA and not scrypt?

TheSerapher avatar Dec 07 '13 12:12 TheSerapher

I wish it was that easy :) No I have watched the logs and the same worker will submit many valid shares and then suddenly start submitting shares where we get the "Job id not found" error. And it is not just the fast workers, that have the issue. It seems to be pool wide when the pool hashrate is high and even more so when there are several users with very high individual hashrate.

It appears to be more of a issues where Stratum is not notifying workers of a block/job change quickly enough, and the workers are turning in work for old (no longer existent) jobs. Could this be solved by speeding up the merkle refresh rate or some other tuning?

CaptainAK avatar Dec 07 '13 13:12 CaptainAK

I've seen this as well in the last few weeks, weird as hell. Doesn't make any sense, seems at best random, different pools at different times, but consistent. I've had more odd stuff happen in the last few weeks than I care to recall, something isn't right.

obigal avatar Dec 08 '13 03:12 obigal

i have several occurencis of this "job not found problem" in my pool does anyone have a fix to this?

scratchy avatar Dec 09 '13 12:12 scratchy

i reckon job based work would solve this but no ones taken up the 70 LTC bounty to fix it yet

ahmedbodi avatar Dec 09 '13 13:12 ahmedbodi

I didn't know there was a bounty, now I'm intrigued. What exactly are we trying to fix?

' Job id not found ' is a normal occurrence on new blocks, there is no fixing that. Period.

In my situation, apparently there was never a new block notification received by my miner (cgminer) and it continues to hash old work, the pool obviously knows about the change because it is rejecting the shares since it has already moved on to the next block. I use the block notify script with PREVHASH_REFRESH_INTERVAL = 60 What I have been trying to figure out is where the fault is.

  1. Did the daemon even try to notify stratum?
  2. Did the daemon send the notification and stratum never received it?
  3. Did stratum receive the notification but choked on it and never broadcasted update?
  4. Did stratum received notification and then broadcast new block but it never made it to the miner?

@ahmedbodi when you say job based what exactly do you mean? Stratum is job based already so I'm a little confused.

obigal avatar Dec 09 '13 19:12 obigal

its not its block based atm. join the #unitedminers-2 channel on freenode and everyone will explain its easier that way

ahmedbodi avatar Dec 09 '13 19:12 ahmedbodi

I have the same problem, stratum was running for hours without problems, but now a lot Job id 'XX' not found messages appear in my log.

Also Exceptions like this started to appear: 2013-12-11 06:23:11,408 ERROR protocol # Processing of message failed Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py", line 185, in dataReceived self.lineReceived(line, request_counter) File "/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py", line 216, in lineReceived raise custom_exceptions.ProtocolException("Cannot decode message '%s'" % line) ProtocolException: Cannot decode message ''

Fredyy90 avatar Dec 11 '13 05:12 Fredyy90

Same problem here. It always starts with this exception:

2013-12-13 10:45:28,466 INFO block_updater # Checking for new block.
2013-12-13 10:45:28,473 INFO block_updater # New block! Prevhash: b5c70ae1f142b11b7d2e8a15b1dbdf7b85e304d705465629b37a8c99d4375ab2
2013-12-13 10:45:28,481 INFO template_registry # New template for b5c70ae1f142b11b7d2e8a15b1dbdf7b85e304d705465629b37a8c99d4375ab2
2013-12-13 10:45:28,486 ERROR template_registry # [Failure instance: Traceback: : 'NoneType' object has no attribute 'write'
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:1095:gotResult
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:1082:_inlineCallbacks
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:362:callback
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:458:_startRunCallbacks
---  ---
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:545:_runCallbacks
/home/---/stratum-mining-cnc/lib/template_registry.py:133:_update_block
/home/---/stratum-mining-cnc/lib/template_registry.py:103:add_template
/home/---/stratum-mining-cnc/mining/subscription.py:26:on_template
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/pubsub.py:54:emit
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/pubsub.py:160:emit
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/pubsub.py:66:emit_single
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py:102:writeJsonRequest
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py:83:transport_write
]

and then goes "Job id not found"

2013-12-13 10:45:28,557 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:28,557 INFO interfaces # False (0) INVALID some_worker
2013-12-13 10:45:28,637 INFO interfaces # 0000008182b4afba63a1ead5c430c0344ee56e6bf8f72fc8dab401987d80f4aa (506) valid some_worker.1
2013-12-13 10:45:28,766 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:28,766 INFO interfaces # False (0) INVALID some_worker.4
2013-12-13 10:45:28,840 INFO interfaces # 0000006ff28b8258966428222a5d74bcc9014a624c3fe5a8c84cd4e805bf7008 (585) valid some_worker.4
2013-12-13 10:45:28,870 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:28,871 INFO interfaces # False (0) INVALID some_worker.4
2013-12-13 10:45:28,996 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:28,996 INFO interfaces # False (0) INVALID some_worker.5
2013-12-13 10:45:29,099 INFO interfaces # 000001be10113b5287df4ba800f88be56efb66eadcb5ac9fbd8e336cd586cb29 (146) valid some_worker.3
2013-12-13 10:45:29,217 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:29,218 INFO interfaces # False (0) INVALID some_worker.1
2013-12-13 10:45:29,229 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:29,229 INFO interfaces # False (0) INVALID some_worker.worker2
2013-12-13 10:45:29,278 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:29,278 INFO interfaces # False (0) INVALID some_worker.30
2013-12-13 10:45:29,304 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:29,304 INFO interfaces # False (0) INVALID some_worker
2013-12-13 10:45:29,327 INFO template_registry # Job id 'e0' not found

It may last for some seconds or more than 10 minutes, with random appearance interval.

Weird thing is that some of the workers are submitting valid shares at the same time. It looks like it is triggered when several blocks are found in a very short time interval, like several seconds.

Stratum restart seems to fix it for some random time. (Just because jobs are recreated)

My assumption is that jobs are being deleted before workers get notified.

Lexazan avatar Dec 13 '13 10:12 Lexazan

Same problem here. Makes my pool somewhat unstable...

helderdb avatar Feb 20 '16 22:02 helderdb