
Cycle in name resolution between mutual spheres

Open jsantell opened this issue 2 years ago • 1 comments

Scenario: Sphere A and Sphere B are in each other's address books. Every ~60 seconds, a gateway attempts to update an entry in its address book. Sphere A's Gateway updates its entry for Sphere B, incrementing Sphere A's version. Sphere B's Gateway sees a new version of Sphere A and updates its entry for Sphere A, incrementing Sphere B's version. Sphere A's Gateway sees a new version of Sphere B and updates its entry again, incrementing Sphere A's version....
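To make the cycle concrete, here is a minimal sketch of the scenario above. It is a toy model, not the gateway's actual code: the `Sphere` struct, `refresh_peer`, and `simulate` are hypothetical names, and a sphere is reduced to a version counter plus the last peer version it has recorded.

```rust
/// Hypothetical, simplified model of a sphere: a version counter plus the
/// last peer version recorded in its address book.
#[derive(Debug, Clone, PartialEq)]
struct Sphere {
    version: u64,
    peer_version_seen: u64,
}

impl Sphere {
    fn new() -> Self {
        Sphere { version: 0, peer_version_seen: 0 }
    }

    /// Periodic address book refresh. If the peer has a version we have not
    /// recorded yet, record it; doing so writes a new revision of this sphere.
    /// Returns true when a new revision was written.
    fn refresh_peer(&mut self, peer_version: u64) -> bool {
        if peer_version == self.peer_version_seen {
            return false;
        }
        self.peer_version_seen = peer_version;
        self.version += 1;
        true
    }
}

/// Runs `ticks` refresh rounds (one round standing in for each ~60 second
/// interval) and returns the final versions of (A, B).
fn simulate(ticks: u32) -> (u64, u64) {
    let mut a = Sphere::new();
    let mut b = Sphere::new();
    // A single real change to B is enough to start the loop.
    b.version = 1;
    for _ in 0..ticks {
        a.refresh_peer(b.version);
        b.refresh_peer(a.version);
    }
    (a.version, b.version)
}
```

In this model, every round adds one revision to each sphere even though nothing user-visible changed after the first edit, so history grows linearly with uptime.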

In the Gateway's fetch handler, bundle_until_ancestor() ends up with a lot of sphere revisions due to this (created without any "user interaction"), causing an OOM(??) in our cluster and triggering a gateway restart. Based on current cluster configs, the problematic gateway idles at 40MB with a Nomad limit of 100MB, so roughly 60MB+ of name record noise accumulated over the weekend.

jsantell avatar May 16 '23 18:05 jsantell

After in-person discussion, it seems like we ought to address this with the following:

The gateway sync routine is currently designed as a Git-like fetch -> rebase -> push flow. The feedback loop is most likely being caused by the fetch -> rebase changing local history, which in turn leads to a push paired with a new link record being published to the name system. Any peer who consumes the new link record will then create new history and re-publish, causing more updates back at the original sphere. This feedback loop repeats indefinitely, causing a runaway loop of data creation.

So, a simple solution (with marginal trade-offs) might be: short-circuit the feedback loop if we determine that no local changes have been made at the time that we try to sync. If there were no local changes to history at the time of sync, then we note it and when it comes time to "push," we push without sending along a new link record to publish to the name system.
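A rough sketch of that short-circuit, under heavy assumptions: the `Sphere` struct, its `dirty` flag, and the `sync` and `count_publishes` functions are hypothetical stand-ins, not the real sync routine or name system API. The only point is that whether there were local changes gets recorded before the rebase, and the link record is published only if there were.

```rust
/// Hypothetical, simplified sphere state as seen by its gateway.
#[derive(Debug, Default)]
struct Sphere {
    /// Latest local revision.
    version: u64,
    /// Version last announced to the name system via a link record.
    published_version: u64,
    /// Last peer version adopted into the address book.
    peer_seen: u64,
    /// True if there are local changes since the last sync.
    dirty: bool,
}

impl Sphere {
    /// A local change to the sphere's history.
    fn user_edit(&mut self) {
        self.version += 1;
        self.dirty = true;
    }

    /// fetch -> rebase -> push. Returns true if a new link record was published.
    fn sync(&mut self, peer_published: u64) -> bool {
        // Note this BEFORE rebasing, since the rebase itself creates history.
        let had_local_changes = self.dirty;

        // fetch + rebase: adopting a new petname resolution writes a revision.
        if peer_published != self.peer_seen {
            self.peer_seen = peer_published;
            self.version += 1;
        }

        // push always happens; publishing is short-circuited when the only
        // new history came from adopting peer resolutions.
        if !had_local_changes {
            return false;
        }
        self.published_version = self.version;
        self.dirty = false;
        true
    }
}

/// Counts link records published by two mutual spheres over `ticks` rounds,
/// after a single local edit to B.
fn count_publishes(ticks: u32) -> u32 {
    let mut a = Sphere::default();
    let mut b = Sphere::default();
    b.user_edit();
    let mut publishes = 0;
    for _ in 0..ticks {
        if a.sync(b.published_version) {
            publishes += 1;
        }
        if b.sync(a.published_version) {
            publishes += 1;
        }
    }
    publishes
}
```

In this toy model, `count_publishes` returns 1 no matter how many rounds run: B publishes its edit once, A adopts it without re-publishing, and the loop never starts.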

The trade-off here is that we won't publish a new version if the only thing we are doing is adopting new petname resolutions. This is perhaps not ideal, although it also may be closer to user expectations (if there are new versions of peer spheres, they may not want to publish their sphere automatically without having had a chance to note that the peer sphere's changed </handwave>).

cdata avatar May 16 '23 19:05 cdata