sui icon indicating copy to clipboard operation
sui copied to clipboard

[Traffic Control] Add dynamic config via admin api

Open williampsmith opened this issue 11 months ago • 3 comments

Description

Implement an admin API endpoint to do a hot reconfiguration of the traffic control policy without clearing blocklists or existing count min sketch.

To facilitate this, the PR introduces a refactor that allows AuthorityState to hold a reference to TrafficController, and for all sui callsites to reference this rather than instantiating separately. This is ok for our purposes (and indeed is much cleaner) because sui-node enforces that validators do not run json rpc server. Additionally, we need to make the policy a member of the traffic controller struct in order to reference it from the main thread.

Things to note: For now, this only works when the node is previously running traffic controller with a blocklist policy. i.e. it will fail if used to enable traffic control on a node that did not already have it enabled. It will also fail if called when the node is running an allowlist policy.

Given the above, there are two intended use patterns:

  1. Before being called, the node is running traffic controller in dry-run mode with some non-allowlist policy config so that tallying and metrics collection occurs. In such a case, it can be called with the appropriate flag to disable dry-run mode. Note that it can also be called to enable dry-run mode where the node previously had it running in active mode.
  2. Before being called, the node is running traffic controller in active mode, but the node operator wants to elevate the policy restriction by adjusting the threshold values.

Note also that this currently is supported for rpc node and validator server traffic control use cases. Bridge node is not yet supported.

Test plan

How did you test the new or updated feature?


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • [ ] Protocol:
  • [ ] Nodes (Validators and Full nodes):
  • [ ] gRPC:
  • [ ] JSON-RPC:
  • [ ] GraphQL:
  • [ ] CLI:
  • [ ] Rust SDK:

williampsmith avatar Jan 15 '25 03:01 williampsmith

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 17, 2025 11:51pm
2 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Jun 17, 2025 11:51pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Jun 17, 2025 11:51pm

vercel[bot] avatar Jan 15 '25 03:01 vercel[bot]

To facilitate this, the PR introduces a refactor that allows AuthorityState to hold a reference to TrafficController, and for all sui callsites to reference this rather than instantiating separately. This is ok for our purposes (and indeed is much cleaner) because sui-node enforces that validators do not run json rpc server.

can you expand on this? without the enforcement is there a risk that trafficController blocks JSONRPC requests? We do run this on fullnodes too

it will fail if used to enable traffic control on a node that did not already have it enabled. It will also fail if called when the node is running an allowlist policy.

is this by design? I think it's desirable for validators to enable traffic control only via the api, but I understand if that's adding too much complexity to this PR.

johnjmartin avatar Feb 05 '25 00:02 johnjmartin

To facilitate this, the PR introduces a refactor that allows AuthorityState to hold a reference to TrafficController, and for all sui callsites to reference this rather than instantiating separately. This is ok for our purposes (and indeed is much cleaner) because sui-node enforces that validators do not run json rpc server.

can you expand on this? without the enforcement is there a risk that trafficController blocks JSONRPC requests? We do run this on fullnodes too

Basically with the current formulation, there will only ever be one instance of TrafficController, since it is held in AuthorityState, which is a singleton. However, there are multiple holders of references to AuthorityState, and specifically, there are two that are callsites for traffic controller. One is the json rpc side, and the other is the AuthorityServer (which runs on validator only). So the risk is that if both json rpc and authority server are running on the same sui-node instance, they will both be writing to the same TrafficController instance, which we want to avoid. But the assertion I'm making is that sui-node is always either a validator or an rpc node or neither, which makes this safe. Lmk if you have any reason to disagree with that statement.

it will fail if used to enable traffic control on a node that did not already have it enabled. It will also fail if called when the node is running an allowlist policy.

is this by design? I think it's desirable for validators to enable traffic control only via the api, but I understand if that's adding too much complexity to this PR.

Yeah it would make things pretty complicated indeed, since you would need to change a bunch of references in authority server, for example, to be mutexes in order to create a traffic controller reference where previously there was none. The solution is to have nodes running in dry run mode, and then they can dynamically go from dry-run mode to live mode through this API without issue. There should be no reason for nodes to not at least run dry run mode at this point.

williampsmith avatar Mar 28 '25 22:03 williampsmith

Do we have metrics for the various settings that are modified by this change? if not it might be good to add them, but that can be a separate PR

I added a metric for dry run in https://github.com/MystenLabs/sui/pull/22363. Good idea on adding for spam and error threshold's, I'll add those as well

johnjmartin avatar Jun 17 '25 22:06 johnjmartin