chore(federation): borrow selection keys

Open goto-bus-stop opened this issue 1 year ago • 5 comments

This PR changes the implementation of SelectionMap so it does not have owned key data.

I'm opening this now so I can share it, but I'd like to improve the PR text a bit after our standup 😛 I don't think we'll have time to review this in the next few days anyway!

Previously, to use SelectionMap, each selection type provided a key() -> SelectionKey method, returning some owned data, in practice two Arcs. Keys are copied frequently. However, key data must always already exist in the selection itself, so this is quite wasteful.

Selection keys now borrow from their Selection. They are Copy, so there is no overhead[^1] in creating or dropping them. They are meant to be created whenever they're necessary, not stored.
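To make the borrowed-key idea concrete, here is a minimal sketch. The `Field` layout and the key's fields are simplified assumptions for illustration, not the actual router types; only the names `SelectionKey` and `key()` come from the description above.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for a selection type.
#[derive(Debug)]
struct Field {
    response_name: Arc<str>,
    directives: Arc<Vec<String>>,
}

// A key that borrows from its selection. It holds only references, so it
// can be `Copy`, and creating or dropping it never touches a refcount.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SelectionKey<'a> {
    response_name: &'a str,
    directives: &'a [String],
}

impl Field {
    // Derive the key on demand instead of caching an owned copy.
    fn key(&self) -> SelectionKey<'_> {
        SelectionKey {
            response_name: &self.response_name,
            directives: &self.directives,
        }
    }
}
```

Because the key only borrows, calling `key()` repeatedly leaves the `Arc` strong counts unchanged, which is the saving described above.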

Because keys are now "free" to create, there is no need to cache them in Field, FragmentSpread, and InlineFragment structures. As a result, there is now also no need for the Field/FieldData split. The Field/FieldData data structures existed to disallow arbitrary mutable access to FieldData that may affect the cached key in the Field wrapper. This limitation is lifted, as keys are now only used by SelectionMap: you can freely mutate a Field or a FieldSelection, as long as it is not in a SelectionMap. SelectionMap only hands out mutable access through FieldSelectionValue, which enforces the mutability constraints.
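The mutability constraint can be pictured roughly as follows. This is a sketch under assumptions: the struct fields and the method names `get` and `selection_set_mut` are hypothetical, and the real `FieldSelectionValue` may expose a different surface.

```rust
// Hypothetical, simplified type; the real one carries much more data.
#[derive(Debug)]
struct FieldSelection {
    response_name: String,      // key data: must not change while in the map
    selection_set: Vec<String>, // non-key data: safe to mutate in place
}

// Wrapper handed out by the map in place of `&mut FieldSelection`.
struct FieldSelectionValue<'a>(&'a mut FieldSelection);

impl<'a> FieldSelectionValue<'a> {
    // Read-only access to everything, including key data.
    fn get(&self) -> &FieldSelection {
        &*self.0
    }

    // Mutable access only to the part that cannot affect the key.
    fn selection_set_mut(&mut self) -> &mut Vec<String> {
        &mut self.0.selection_set
    }
}
```

Outside the map, a plain `&mut FieldSelection` remains available, so code that owns a selection can still change anything, including the key data.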

I found this to improve performance across the board by 1-3%. That's not super impressive, but not bad either for something that affects every plan. I think this is also a stepping stone towards having mutable selections/selection sets so we won't need to copy so much.

Review guidance

There are a lot of code changes here. I tried to group the commit history into grokkable chunks. Note that I didn't necessarily keep it compiling between commits 🙈

  • Change the SelectionMap implementation to not store SelectionKeys
    • This uses hashbrown's HashTable and a Vec instead of IndexMap. The implementation is similar to IndexMap, except that keys are derived as necessary.
    • The main work is in https://github.com/apollographql/router/pull/6074/commits/42489d1239d037f0757c857901f6c2fa1d16f750
  • Change SelectionKey to be borrowed
    • Remove the iter() and iter_mut() iterators from SelectionMap: https://github.com/apollographql/router/pull/6074/commits/de5ca955b0d08ce7b719f2a688f44b72e3e29b2b
      • iter_mut is not possible to implement with borrowed keys because it would have to yield (SelectionKey<'a>, SelectionValue<'a>), i.e. an immutable and a mutable reference to the same selection. The methods are unnecessary as you can trivially derive the key from the value.
    • The actual migration: https://github.com/apollographql/router/pull/6074/commits/6a8924d57f37fd86ee0f09c84837a313955af317
    • There is an OwnedSelectionKey type that's used in selection set merging. The way that code is written requires an owned key structure. That can be adjusted in the future, but I decided to keep it as is, partly because we recently had to revert a change to this code.
  • Combining the selection "Data" and wrapper structures:
    • Adjust InlineFragment::can_rebase_on and InlineFragmentData::can_rebase_on so the names won't collide after the merge: https://github.com/apollographql/router/pull/6074/commits/68f3ececa73af7dce69150d0094d94c7fd6d0c88
    • Combine Field and FieldData into a single type Field: https://github.com/apollographql/router/pull/6074/commits/4e43243d34d4210aec347ceac0fb3f7adf4390d0
    • InlineFragment: https://github.com/apollographql/router/pull/6074/commits/c2da90c650176fd46eca7a9a99d04219bd3575c2
    • FragmentSpread: https://github.com/apollographql/router/pull/6074/commits/1cf815711fd77c6550c76a2f763350a5aa269a3a
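
The "keys are derived as necessary" map design from the first bullet can be sketched as below. This is not the PR's implementation: to stay dependency-free it swaps hashbrown's HashTable for a std `HashMap` from key hash to positions in the `Vec`, and the `Selection` type and all method names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical, simplified selection: the key data lives inside the value.
#[derive(Debug, Clone, PartialEq)]
struct Selection {
    response_name: String,
    sub_selections: u32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SelectionKey<'a> {
    response_name: &'a str,
}

impl Selection {
    fn key(&self) -> SelectionKey<'_> {
        SelectionKey { response_name: &self.response_name }
    }
}

fn hash_key(key: SelectionKey<'_>) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

// Insertion-ordered map that stores only values. The index maps a key hash
// to positions in `entries`; equality is checked by re-deriving the key
// from the stored value, so no key is ever stored.
#[derive(Default)]
struct SelectionMap {
    entries: Vec<Selection>,
    index: HashMap<u64, Vec<usize>>,
}

impl SelectionMap {
    fn position(&self, key: SelectionKey<'_>) -> Option<usize> {
        self.index
            .get(&hash_key(key))?
            .iter()
            .copied()
            .find(|&i| self.entries[i].key() == key)
    }

    fn get(&self, key: SelectionKey<'_>) -> Option<&Selection> {
        self.position(key).map(|i| &self.entries[i])
    }

    // Inserts or replaces in place, returning the previous value if any.
    fn insert(&mut self, value: Selection) -> Option<Selection> {
        let hash = hash_key(value.key());
        match self.position(value.key()) {
            Some(i) => Some(std::mem::replace(&mut self.entries[i], value)),
            None => {
                self.index.entry(hash).or_default().push(self.entries.len());
                self.entries.push(value);
                None
            }
        }
    }

    // Values only: callers derive the key from each value when needed.
    fn values(&self) -> impl Iterator<Item = &Selection> {
        self.entries.iter()
    }
}
```

Iterating over values alone also sidesteps the `iter_mut` problem noted above: a mutable iterator would only need to yield the value wrapper, never a key that borrows from the same selection.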

[^1]: I mean, it's still some memory being moved around, but it's not very substantial.

goto-bus-stop avatar Sep 27 '24 14:09 goto-bus-stop

@goto-bus-stop, please consider creating a changeset entry in /.changesets/. These instructions describe the process and tooling.

github-actions[bot] avatar Sep 27 '24 14:09 github-actions[bot]

CI performance tests

  • [ ] connectors-const - Connectors stress test that runs with a constant number of users
  • [x] const - Basic stress test that runs with a constant number of users
  • [ ] demand-control-instrumented - A copy of the step test, but with demand control monitoring and metrics enabled
  • [ ] demand-control-uninstrumented - A copy of the step test, but with demand control monitoring enabled
  • [ ] enhanced-signature - Enhanced signature enabled
  • [ ] events - Stress test for events with a lot of users and deduplication ENABLED
  • [ ] events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • [ ] events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • [ ] events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • [ ] events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • [ ] events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • [ ] extended-reference-mode - Extended reference mode enabled
  • [ ] large-request - Stress test with a 1 MB request payload
  • [ ] no-tracing - Basic stress test, no tracing
  • [ ] reload - Reload test over a long period of time at a constant rate of users
  • [ ] step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • [ ] step-local-metrics - Field stats that are generated from the router rather than FTV1
  • [ ] step-with-prometheus - A copy of the step test with the Prometheus metrics exporter enabled
  • [x] step - Basic stress test that steps up the number of users over time
  • [ ] xlarge-request - Stress test with 10 MB request payload
  • [ ] xxlarge-request - Stress test with 100 MB request payload

router-perf[bot] avatar Sep 27 '24 14:09 router-perf[bot]

I'll update this to use hashbrown::HashTable over hashbrown::raw::RawTable. Drafting until then

goto-bus-stop avatar Oct 02 '24 14:10 goto-bus-stop

✅ Docs Preview Ready

No new or changed pages found.

svc-apollo-docs avatar Oct 14 '24 13:10 svc-apollo-docs

Rebased, tried to rewrite commit history to make a little more sense, will write a new PR description and fix linting before undrafting for review.

goto-bus-stop avatar Oct 14 '24 14:10 goto-bus-stop

Very nice cleanup!

dariuszkuc avatar Oct 28 '24 13:10 dariuszkuc