stratum-mining-litecoin icon indicating copy to clipboard operation
stratum-mining-litecoin copied to clipboard

Submit Shares with higher diff than send to client

Open TheSerapher opened this issue 11 years ago • 31 comments

https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/service.py#L90

Reading this line, I would assume the following could happen:

  • New miner connects at POOL_TARGET difficulty
  • Template is used to send out POOL_TARGET difficulty shares
  • Miner is working on POOL_TARGET until RETARGET_TIME is matched
  • While meeting the new TARGET with NEW shares sent to miner, OLD POOL_TARGET shares will be accepted with RETARGET_DIFF difficulty due to the workers session difficulty being used

I am not sure if my way of thinking is correct, but I have seen issues with clients that tend to re-target a lot. Hashrates up to 50 times higher than they should be (5.6 mhash instead of 800 khash). When checking the DB I saw a lot of high diff shares being accepted right after a diff retarget from 16 to 480.

EDIT: Maybe this can be further tested when using a low re-target time? It might show a lot of weird diff shares being added in the DB that might be reported differently in the client?

EDIT2: More findings. Worker difficulties are stored per worker-connection. They are created here: https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/service.py#L49 . Once created, it is also used to access the mentioned workers difficulty and store it inside the session when doing a re-target: https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/basic_share_limiter.py#L148 So unless shares are properly validated against their ORIGINAL difficulty and not the assumed one by the session, I would think this could lead in issues when retargets happen often or re-targets changes are very different (e.g. the mentioned 16 to 480).

EDIT3: Looked at how jobs work, apparently they are not stored on a per-share basis (each share has a stratum stored job-id) but are rather only referencing the block that we wish to find: https://github.com/moopless/stratum-mining-litecoin/blob/master/lib/template_registry.py#L85

These are all assumptions! I have by no means any clue how stratum/templates and all that works together in the grand scheme of things. If anyone is able, and willing to, shed some light on this, that would be much appreciated!

If all these assumptions are correct, I'd think it might be needed to find a way to 'cache' sent out shares and they difficulty and finding a way to re-evaluate their difficulty based on the send share once the client returns the properly calculated results. Otherwise shares could always be off.

Another idea: Slowly ramp up share difficulty for workers. Instead of assuming a diff that might match the share rate we aim for per worker, only increment by a percentage. To allow workers to hit their target share rate faster, we can lower the retarget time. Since we only increase by a certain percentage, even though we retarget more often we would still not submit shares that are completely off. And a good side effect: The higher the diff the higher you would increase the diff with the next retarget. Since higher diff means slower share rates already, the potential risk of shares being off diff would decrease.

TheSerapher avatar Oct 22 '13 20:10 TheSerapher

The problem is that if you adjust difficulty up without allowing the old diff, you get a bunch of shares rejected by the pool for being above the target, because the miners have a local queue of work that they have to get through before they'll start working on the higher difficulty shares.

I looked at this code a while back and guessed that it could potentially be exploitable. Have a look at the generalfault version, I believe it lets you set maximum adjustments.

flound1129 avatar Oct 29 '13 08:10 flound1129

Maximum adjustments are also just a workaround. The only good solution is the way eloipool does it: Send work out, log it as a job, receive work and check what job information is stored for it. This way you won't reject shares and you can check it properly.

TheSerapher avatar Oct 29 '13 08:10 TheSerapher

Here is a little different vardiff order more or less in powers of 2 if you would like to experiment, well actually it just doubles the initial value e.g 16,32,64,128,256,512,1024

It only retargets one step at a time up or down no matter the difference asked for, it could probably be improved, I haven't messed with it for some time now.

925076c34852854c3f9792469c7d97b0dd025f3d

obigal avatar Oct 30 '13 04:10 obigal

So it seems the moopless stratum fork is logging share solely on the difficulty variable stored in session (on a per-worker basis) while a client could still be sending shares at a previous (and lower) difficulty. Edit : wrong statement, please continue reading.. :)

I'm not supposed to be a developper, but next to my internal tests on my pool, I check my users shares diff using this formula which return the share diff from the hash_int :

Algebra tells us the diff_to_target is the same as hash_to_diff share_diff = int(self.diff_to_target(hash_int)) https://github.com/moopless/stratum-mining-litecoin/blob/master/lib/template_registry.py#L245

You can try enabling the display of this var at any time on a testnet pool.

So, what I'm trying to say is : instead of logging in the shares table in the database the session difficulty set by your vardiff engine, just use your vardiff engine with the session variable to manage the broadcasted mining diff to your stratum clients BUT log the share diff using the above formula. As you can easily know the price per share for a difficulty 1 share, it is then easy to fairly pay your miners with that share diff from that formula. I am not using this fork so I don't know if I'm helping or if it would be difficult to log the share difficulty using this formula instead of using the session difficulty reference.

Cheers.

NINJA EDIT : It already is the case, share_diff is used to log the share diff in the database It is returned by the method in template_registry : https://github.com/moopless/stratum-mining-litecoin/blob/master/lib/template_registry.py#L281

So, in service.py it is captured back : https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/service.py#L99 It is rightly used to log healthy shares Interfaces.share_manager.on_submit_share(worker_name, block_header, block_hash, difficulty, submit_time, True, ip, '', share_diff)

BUT still using difficulty (which is session['difficulty']) for bogus shares where an exception was thrown by template_registry : Interfaces.share_manager.on_submit_share(worker_name, False, False, difficulty, submit_time, False, ip, e[0], 0)
https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/service.py#L103

(after thinking twice, it seems hard to use share_diff when an exception was thrown by template_registry : we don't know how and where the template_registry check failed, it leads to an undefined share_diff variable)

That is to say : It may not be related with the issue we are talking about, unless you are calculating the hashrate (or worse, giving a LTC reward) including the bogus shares where template_registry thrown an exception. If I'm right and your CMS is calculating the hashrate from all shares (including the bogus ones), maybe try avoid including those (or don't log these shares at all if you don't need history of bogus shares). If I'm right, seems more like a glitch than an exploit.

Final edit : Remember, generating a bogus share (that won't end up being paid as our_result won't be "Y") doesn't take any effort. Still, this bogus share will be logged with session['difficulty'] and may fuck up with our hashrate calculation functions if you take all the shares from tables in count (seems to be the case here https://github.com/TheSerapher/php-mpos/blob/next/public/include/classes/statistics.class.php#L213) flound1129 (in the next comment) points out a possible DoS (Denial of Service) at stratum / DB layer by logging bogus shares on an unlimited basis, which is this moopless fork doing currently until a commit / pull request is brought to the tree).

M0nsieurChat avatar Oct 30 '13 14:10 M0nsieurChat

The logging of bogus shares (which can be submitted very quickly) also causes a potential for DoS at both the stratum and DB layer, I have added code to my own stratum fork to temporarily ban workers that submit more than a certain number of bogus shares in a minute.

flound1129 avatar Oct 30 '13 16:10 flound1129

Thanks for taking the time to read my comment, flound1129. That's exactly what I said on my final edit (regarding the faisability of sending a large amount of bogus shares in no time). I think logging bogus shares in the database should be related to a param in config.py or the loglevel param. (And it needs to be rate limited if the pool admin enables bogus shares logging)

Waiting for TheSerapher to confirm if the shares with a high diff are bogus ones (no blockhash, no solution). If not, I guess we'll need to continue our work.

M0nsieurChat avatar Oct 30 '13 16:10 M0nsieurChat

The problem is that if you don't log bogus shares then you cannot calculate worker efficiencies, pool efficiency, etc. But yes it could be controlled by a config.py variable.

I think a better solution is to disconnect and temporarily ban workers that submit too many bogus shares.

flound1129 avatar Oct 30 '13 16:10 flound1129

You are totally right, this info is needed for efficiency stats and so on. And this bogus shares logging needs to be rate limited to avoid abuse / DoSing. Putting a table index (to speed up processing) on the our_result (or whatever is called your column to identify a valid share) column to only take the valid shares (using a WHERE myValidShareColumn = 'Y') for calculating the hashrates sounds right to me aswell. Why a bogus share would count to report an hashrate (which is your / the pool VALID hashes generated per second) ?

M0nsieurChat avatar Oct 30 '13 16:10 M0nsieurChat

Bogus shares don't (with most pools I'm aware of) and should not count toward a user's hashrate.

flound1129 avatar Oct 30 '13 16:10 flound1129

Seems that the frontend used by a good bunch of pools is using every shares in the table(s) including the bogus ones to calculate this. https://github.com/TheSerapher/php-mpos/blob/next/public/include/classes/statistics.class.php#L213 I might be wrong, not a PHP/MySQL guru..

M0nsieurChat avatar Oct 30 '13 16:10 M0nsieurChat

It wasn't until vardiff was added. I will fix that :-) Thanks for the heads up on the.

A share rate limiter on invalid shares might make sense. especially if it is a ton in a very short time. I have seen my mine gone nuts two times which I'm the end caused it to crash, but until then a good amount of invalids was already submitted.

Sadly this is not my project nor do I have time to fork and work on it myself. Someone else needs to step up for this project to continue and be maintained :-)

TheSerapher avatar Oct 30 '13 18:10 TheSerapher

I'd be happy to submit my temp banning code if someone can start maintaining this again.

flound1129 avatar Oct 30 '13 18:10 flound1129

With all these findings in this issue, maybe we can say that (after I fixed my pool frontend with above ticket closed) even with some shares being added at really strange diffs or diffs that are off, this might be a non-issue as long as only invalid shares are affected by it?

Even after reading all that, it still seems very fishy that my friend was able to pick up on a 20,000 shares lead in less than a day. From my point of view, it inserted valid shares at wrong diff, otherwise he would not have been able to catch up that fast. I have also heard that this issue was reproduced by other ops and they have seen the same strange behavior.

Hopefully this turns out to be a non issue after all. But until I am sure I'd not recommend turning VARDIFF on.

EDIT:

https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/service.py#L99-L100

The workers own difficulty is indeed sent to the submit_share in template registry:

https://github.com/moopless/stratum-mining-litecoin/blob/master/lib/template_registry.py#L176

The only check done on this custom difficulty is rather simple:

https://github.com/moopless/stratum-mining-litecoin/blob/master/lib/template_registry.py#L233-L234

So MAYBE this is indeed a non-issue, would not explain the weird things that I have been seeing with my friend and valid shares ... Strange.

TheSerapher avatar Oct 31 '13 07:10 TheSerapher

Here a sample after turning VARDIFF on again:

| 302111 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 0000001f4cd5fffb1d957e0b7579293c6d4a260aa8d2d9815b3632ec142ca9cc |        384 | 2013-10-31 07:30:59 |
| 302110 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 0000002c7b7c10983aa34043eab1531fe790b9ac59795bcbbc736fb91b74ba2e |        384 | 2013-10-31 07:30:55 |
| 302109 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 000002c7dd05a2a54c84e4d5ff9e8fc7345371243c2267f7ccb34304c2b19342 |        384 | 2013-10-31 07:30:55 |
| 302108 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 00000180f53583de667bb3b0434ae27472a61f71f2a78dcaf7e265c80c87882f |        384 | 2013-10-31 07:30:53 |
| 302107 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 0000003404a15e182a9db1e8930dfe380c539f2fed6f95e6af576756eb9d1406 |         64 | 2013-10-31 07:30:53 |
[...]
| 302105 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 000003d169b15bc0f51f4a83d284872e52fa1cdb0d11d8eb9f46f37939efd6a2 |         64 | 2013-10-31 07:30:42 |
| 302104 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 0000012137c921e5a2fb478fa8512f2d96a839268a05a38b97930973d4a136fc |         64 | 2013-10-31 07:30:38 |
| 302103 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 00000121e2250c6bd5e24c6b194f223dcad3aab0e9904b4a47f14aeed7aaf34a |         64 | 2013-10-31 07:30:32 |
| 302102 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 0000039b3a36e1a684acc9473e0a55d35c6b1f5860ea5fafda891a998e8401c3 |         64 | 2013-10-31 07:30:28 |
| 302101 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 000002c5e29040c387c6cde73b8785398c946474740b7600273c43f75eea9128 |         64 | 2013-10-31 07:30:19 |
| 302100 | 10.0.2.2 | cptvir2l.amd        | Y          | N               |        | 0000009c7dcccd17bc4953e5ce47748c4ce32347bcdbc1f21cc463f37ca0575a |         64 | 2013-10-31 07:30:15 |

Long line of 64 diffs (POOL_TARGET) followed by high diff shares in very quick succession. One might say that was luck, but I had it happen twice with his re-targets:

| 302081 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000002a988259ca175d71aa774c88b34000410ad6d420df4dae2cc8330cc5c5b |         64 | 2013-10-31 07:28:58 |
| 302077 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000002806e1178568233bf7be4198f2f1e081b55239121272651b581f850ac95 |         64 | 2013-10-31 07:28:48 |
| 302070 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 0000005af7d649b434a1a218ca14de71cf0dd701b64bde97178049d8bc63ba82 |        480 | 2013-10-31 07:27:29 |
| 302068 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 0000005bcbdae576a7e1915135677a1909dab5a18aa589052ae8bffad4369506 |        480 | 2013-10-31 07:27:19 |
| 302065 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 00000047d0e62b9e69e0b60776aeef61290ef971b635aef46009151fadc4413e |        480 | 2013-10-31 07:27:07 |
| 302059 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000000f6fcbda6920952f51628b5119bc7f8e03b6e4443858ed656fb94985dba |         64 | 2013-10-31 07:26:47 |
| 302056 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 0000026e31f9a9d14fbc6ee5d367a7782d73cc9250285d0f25e6262c82f43348 |         64 | 2013-10-31 07:26:36 |
| 302053 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 00000199312b1bfa431efd7c253920483a42426844303ea2ee5b8e330f2a9c99 |         64 | 2013-10-31 07:26:34 |
| 302049 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000000f5d0efca72cccccff7b996efb1b551ac3ef9f3a1f9ef47a39294b9543f |         64 | 2013-10-31 07:26:33 |
| 302047 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 00000202e478aa853f7fc5e80950c2b0dc1f90c14539989a5ff89048e9e4a4d4 |         64 | 2013-10-31 07:26:30 |
| 302045 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 0000002c4e24d6c875a4b96bc983169005b324dd0cf4335494986fcd8e1d1e3c |         64 | 2013-10-31 07:26:28 |
| 302043 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000001f2fbbc8b7436f70a90ad0132c7b258a8e5f696961c5874234384c25986 |         64 | 2013-10-31 07:26:20 |
| 302042 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 0000031a500f40cc1df99d4cfb9220fbbe68efdf28ca15a8881896eab021c5a3 |         64 | 2013-10-31 07:26:16 |
| 302041 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000000bbf4a3e8732136c907d263e5af4c51d8891fffb59af5bb5fac146fbbae |         64 | 2013-10-31 07:26:16 |
| 302037 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000001b016754d261a489a62a81d2a9929cd6473c23914fe01f5d7347ec7e618 |         64 | 2013-10-31 07:26:09 |
| 302035 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000002a1c93caca44e69182e1a58238024a1eba8dbc1a2b94c548141aa859f01 |         64 | 2013-10-31 07:26:01 |
| 302034 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000003348d4b5c903e5e935c590f1e77034a2b72a9e8684c79430021df51f286 |         64 | 2013-10-31 07:25:59 |
| 302030 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000000735a15956f018c6f6ca7181f79762deb6b78e9c8e34303325d2007a726 |         64 | 2013-10-31 07:25:53 |
| 302029 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 0000025105213c5204b8c5e4afc78d77345a95be56a36eca2fad183baf2db30c |         64 | 2013-10-31 07:25:52 |
| 302028 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000000918ddd41783013f51ff8e614e3fd74295048bc3c4e06e885097b8356a5 |         64 | 2013-10-31 07:25:51 |
| 302027 | 10.0.2.2 | cptvir2l.amd | Y          | N               |                      | 000001f7402aef0bdda580a0bb2324f5dcbdb1c4ad89ec6a4ea8aeef9d68e211 |         64 | 2013-10-31 07:25:48 |

I cannot deny this seems strange to me. Maybe I have been barking up the wrong tree here, but I know that he isn't faking any shares here (since he wouldn't even know how :P)

TheSerapher avatar Oct 31 '13 07:10 TheSerapher

Also saw one of those, can those happen randomly? He didn't change anything on his side that would cause shares higher target:

| 302173 | 10.0.2.2 | cptvir2l.amd        | N          | N               | Share is above target | 0                                                                |        480 | 2013-10-31 07:37:21 |```

TheSerapher avatar Oct 31 '13 07:10 TheSerapher

And here a screenshots of the past 10m hashrates. It works fine with VARDIFF disabled or shares being at 64 diff, but as soon as I enabled VARDIFF his hashrate doubles. Worst I have seen was 5 MHash on valid shares!

screen shot 2013-10-31 at 8 47 59 am

TheSerapher avatar Oct 31 '13 07:10 TheSerapher

I think a quite good solution would be to implement obigal's changes (I actually did it and it works very well)

https://github.com/moopless/stratum-mining-litecoin/commit/925076c34852854c3f9792469c7d97b0dd025f3d

and then we have gradual diff raise. Next step would be to implement initial diff for pool worker based on last value from pool_worker table - this should minimize the risk of simple exploit and we should still benefit from using vardiff.

feeleep75 avatar Oct 31 '13 10:10 feeleep75

NEWS https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/service.py#L109 There is both difficulty (which is session['difficulty'] and share_diff (which is true share diff at time of submit) passed on the method on_submit_share.

The method import_shares (which logs shares in the database) uses the difficulty variable to log the share diff, not share_diff : https://github.com/moopless/stratum-mining-litecoin/blob/master/mining/DB_Mysql_Vardiff.py#L57 v[10](which is share_diff) is never used is that function, however it is a known value because it is passed as an argument.

So, whatever returns template_registry regarding the true share diff (not the session['difficulty']), the share is logged with session['difficulty']. For me, it explains everything.

I can see why moopless didn't put v[10] to log the share diff. The disadvantage of putting v[10](which is share_diff fired up by template_registry) is that you'll end up with logged shares with a different difficulty each time they are logged. Like, for a worker with a stratum diff @16 you'll end up logging shares with a share_diff ABOVE 16 (not less). It could be 16, 17, 18, 400.... So you can't let it go as is with a moving difficulty. If a worker solves a block, you'll end up with a share logged with a fucking huge difficulty which would eat all your block reward (nothing left for other workers)

For me, the problem regarding your wrong share diff is right there. But I can't find any solution to log the closest stratum diff.

M0nsieurChat avatar Oct 31 '13 11:10 M0nsieurChat

@M0nsieurChat last comment

there is a solution to partially solving the problem - we can log into shares table lower value from share_diff and worker_diff

feeleep75 avatar Oct 31 '13 11:10 feeleep75

I think the only real solution is logging work sent as a job and re-insert finished work with the information we stored in the job.

I will see about @obigal solution. Have turned vardiff off since I was seeing the weirdness again.

TheSerapher avatar Oct 31 '13 11:10 TheSerapher

As discussed with dcwsco ive made my own stratum branch based on moopless for working with various coins. If anyone wishes to push a fix there the repo link is : http://bitbucket.com/ahmedbodi/stratum-mining

TheSerapher if u can confirm Obigal's fix works well ill add the code in later tonight to all branches

ahmedbodi avatar Oct 31 '13 12:10 ahmedbodi

Here is the stratum-mining fix: https://github.com/Neozonz/stratum-mining-litecoin

I'll continue to maintain this code in moopless's absence any and all help would be appreciated

Neozonz avatar Oct 31 '13 14:10 Neozonz

ill help where i can but i wont be focused on it since im mainly going to be running SHA pools so will be working on the SHA branches on my code

ahmedbodi avatar Oct 31 '13 15:10 ahmedbodi

s/fix/workaround/ but in the end you move the issue from the pool ops to the pool users. Won't go into details since we discussed this on IRC already, just posting here to ensure people know they still have to wait for a real fix :)

TheSerapher avatar Oct 31 '13 15:10 TheSerapher

We could limit this even further until we have a permanent fix by maybe just keep adding pool target instead of going by doubles were we have lot bigger jumps between retargets as difficulty climbs

obigal avatar Oct 31 '13 16:10 obigal

Maybe someone will understand this... Didn't test, didn't try yet. If you're willing to, this is not the only file to update. There is service.py as well (to fetch back shareAtOldDiff and do something different if it happened).

I DON'T KNOW IF IT WORKS. THE LOGIC IS HERE THOUGH.

https://github.com/M0nsieurChat/stratum-mining-litecoin/blob/patch-1/lib/template_registry.py#L235

edit : fork with patch-1 branch works for my testnet pool (accepting shares && blocks), need to test it against this issue now..

M0nsieurChat avatar Oct 31 '13 22:10 M0nsieurChat

Keeping it simple we can filter probably better than 95% of bogus shares without changing too much, this will add a delay 9b9f440542f34c17ee030b8435cc378d5a6bd536 before we start using the new requested vardiff. It's not a perfect solution but it works upwards and downwards with requested difficulty so much like mining itself it should average out over time.

obigal avatar Nov 03 '13 20:11 obigal

A little update 14c5be169b8a25a8a75630372c68178db8750326 looks like doubling the difficulty (x2) is a little more predictable time wise, my results look like somewhere in the 25-30 second time range works pretty well, it's still depends on how many shares a user might have queued, and of course luck.

obigal avatar Nov 03 '13 23:11 obigal

Curious if anybody has try using the target delay and their results if they have, I took a look at they way eloipool does it and correct me if I'm wrong but it doesn't look like it's doing anything different, all users receive the same job and when a share comes in there is nothing specific that ties it to requested difficulty other than the job id which is the same for every user.

I also have added some temporary worker banning code to my branch if anybody is interested.

obigal avatar Nov 06 '13 13:11 obigal

Thank you.

Still waiting for someone to review this patch https://github.com/M0nsieurChat/stratum-mining-litecoin/tree/patch-1 Quickly tried it on my testnet pool. Shares are no longer being accepted as higher diff but logged with previous diff.

M0nsieurChat avatar Nov 06 '13 13:11 M0nsieurChat