atree icon indicating copy to clipboard operation
atree copied to clipboard

Create optional map key index to optimize map key iteration 🆕

Open fxamacker opened this issue 1 year ago • 0 comments

Atree OrderedMap stores element's key and value together to reduce number of touched slabs for write ops.

However, this can be inefficient for map key iteration since both keys and values are loaded when only keys are needed.

For very large maps, interaction limits can be reached before iteration is completed. See issues:

  • https://github.com/onflow/cadence/issues/3577
  • https://github.com/onflow/cadence/issues/3544

Suggestion

As mentioned in today's Cadence Implementation Sync, I think we can resolve this by creating an optional keys-only register (map key index) for atree maps when required thresholds/conditions are satisfied.

One approach is for Atree to provide callback functions to determine if index should be created or deleted. If the caller (e.g. Cadence) provides different callback functions then those will be used instead of defaults.

We can:

  • Create index only for large atree maps.
  • Store index in its own register(s) by using atree array or map under the hood for scalability.
  • Map key iteration would use the index if it exists.
  • Deploy this feature using HCU (no spork).

Maintaining index adds overhead for both storage size and speed. Need to try to reduce this overhead.

TODO:

  • [ ] @j1010001 will find out if this is needed in Q4 2024 or later
  • [ ] @fxamacker POC
  • [ ] Look into reasonable thresholds to create map key index by examining testnet and mainnet states.
  • [ ] Understand its impact on storage size, performance, etc.

fxamacker avatar Sep 23 '24 22:09 fxamacker