graph-node icon indicating copy to clipboard operation
graph-node copied to clipboard

Integrate new PoI hasher

Open That3Percent opened this issue 3 years ago • 12 comments
trafficstars

stable-hash, the library implementing the hashing algorithm for PoI, has a PR open for a new variant that is more than 100x as fast as the current variant, while remaining very strong against "accidental collision". It is not cryptographic strength, but the chances of collision for a non-attacking subgraph are practically zero.

We should adopt this new variant in graph-node as a part of our initiative to speed up indexing. Right now, the PoI takes < 5% of the time to index a typical subgraph, but the expectation is that when the Firehose is integrated, the PoI would take up a much more significant portion of the overall time without this change.

That3Percent avatar Feb 05 '22 17:02 That3Percent

The latest stable-hash is now merged and published to crates.io! Thanks @lutter for your thorough review.

Some things that will need to happen to get this over the finish line:

  • In addition to importing the legacy GitHub commit for the dependency, add 0.4.0 from crates.io (and rename the old to stable_hash_legacy)
  • Implement the new StableHash trait for all types required by the PoI in graph-node, using the reference examples from stable-hash for the best performance
  • Implement proof_of_indexing/online using the new API
  • Ideally re-implement the reference implementation under proof_of_indexing/reference to ensure correctness of the online implementation
  • Have an enum that switches between the old/new implementation as appropriate
  • Add an option to the subgraph manifest to enable the new implementation
  • Maybe for the finalization step that mixes in the indexer address use some cryptographic strength method, like keccak(indexer | hasher.finalize()) or something.

Feel free to ping me for questions. I can do the review.

That3Percent avatar Mar 23 '22 02:03 That3Percent

@That3Percent as we discussed we need a fix and then a new tag on crates.io. For reference could you also add the examples you refer to in the last comment? Thanks 🙏

mangas avatar Mar 29 '22 08:03 mangas

@mangas Version 0.4.1 is published to crates.io.

This patch:

  • Fixes the intermittent compiler error you were seeing
  • Supports big endian architectures
  • Removes all unsafe
  • Updates to Rust 2021 edition

That3Percent avatar Mar 29 '22 13:03 That3Percent

@mangas

Here are the best examples for implementing the StableHash trait: enum, struct

That3Percent avatar Mar 29 '22 13:03 That3Percent

Marking this has blocked, waiting for PR Review, and the new Implementation by @That3Percent

mangas avatar May 04 '22 15:05 mangas

@mangas Unblocked. Did a PR review and responded to issue in Slack.

That3Percent avatar May 04 '22 21:05 That3Percent

As discussed, ticket is with zac for the implementation of the new BlockEventStream enum and then I'll take over to do the algo selection based on version (discussion pending with @azf20 )

mangas avatar May 05 '22 15:05 mangas

@That3Percent - any updates here?

dizzyd avatar Jun 20 '22 15:06 dizzyd

@dizzyd Yep. This got deferred for graph-day but I took it back up soon after. There's a WIP that was posted Friday here: https://github.com/graphprotocol/graph-node/pull/3678 All that is left on my end is self-review, which will probably find a few minor TODOs. I can probably finish by EOD or early tomorrow.

Once that PR merges someone still needs to:

  • Enable the new PoI via a feature flag in the subgraph manifest (using a "rollup" version, I'm told)
  • Test

That3Percent avatar Jun 20 '22 15:06 That3Percent

@dizzyd @mangas: The PR is ready for review

That3Percent avatar Jun 20 '22 17:06 That3Percent

@mangas: You can find the places that need to change to switch on the new algorithm by searching for a0a79c0f-919f-4d97-a82 a-439a1ff78230. There are only two instances.

That3Percent avatar Jun 20 '22 17:06 That3Percent

Update: Still waiting on https://github.com/graphprotocol/graph-node/pull/3678#issuecomment-1176539459 to be merged.

Since the API is now defined, we can prolly start working on the next step of enabling the new API for a new manifest version.

mangas avatar Jul 12 '22 09:07 mangas

@azf20 @mangas Let's hold off on merging this. I'd like to make a small improvement which may unlock some protocol improvements.

I would like to have the finalization step include the total subgraph gas spent in the hash. This would enable us to substantiate claims about subgraph indexing costs in a slashable way. This might factor into Curation V2 or other systems.

That3Percent avatar Nov 04 '22 04:11 That3Percent

@azf20 @mangas I previously advised holding off merging, but I don't think there is any stability around what changes we may like to make, and there are perhaps other changes we could make in the future or maybe evolve this later to address those concerns. So, there is probably not a good reason to delay progress. Sorry about that - let's get these changes integrated.

That3Percent avatar Jan 23 '23 21:01 That3Percent

Per the above comments and #3678 this looks to be done, please re-open @That3Percent @mangas if there is still work to do here

azf20 avatar Mar 04 '23 00:03 azf20