Peer restarts can be squelched

Open peterbourgon opened this issue 9 years ago • 0 comments

This is a transplant of https://github.com/weaveworks/weave/issues/1867. Quote,

When a peer restarts quickly, not all other peers necessarily see the peer go away, since topology gossip data merging may combine the removal and addition, or the events get re-ordered in transit, with the removal being skipped (since it will have an earlier version) - both of these effectively just end up updating the peer UID and version. If that happens on just a single peer then the DNS entries of the restarted peer are all retained. This is problematic in two ways:

  • if any containers died on the peer while the weave router was down, their entries are leaked.
  • for surviving containers, the version of the re-created entry may well be lower than the existing one, e.g. if previously the entry had been tombstoned and resurrected a few times. If the last version of the entry known by surviving peers was a tombstone, then this will effectively wipe out the re-created entry.
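The second failure mode can be illustrated with a minimal sketch. It assumes a plain highest-version-wins merge rule; `Entry` and `mergeEntry` are hypothetical names made up for this illustration and are not the actual weave DNS types.

```go
package main

// Entry is a hypothetical, simplified DNS record used only in this sketch.
type Entry struct {
	Version   int
	Tombstone bool
}

// mergeEntry keeps whichever copy carries the higher version.
// This highest-version-wins rule is an assumption made for illustration.
func mergeEntry(existing, incoming Entry) Entry {
	if incoming.Version > existing.Version {
		return incoming
	}
	return existing
}
```

If a surviving peer still holds `Entry{Version: 4, Tombstone: true}` and the restarted peer re-creates the entry as `Entry{Version: 1}`, the merge keeps the tombstone, so the live container's entry stays hidden.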

Possible fix,

Peers could invoke the OnGC callbacks when the UID of a peer changes.
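A rough sketch of that idea follows, under stated assumptions. `Topology`, `peerInfo`, `ApplyUpdate`, and the callback signature are hypothetical simplifications written for this example, not the mesh API, and the model is single-goroutine with no locking. It also assumes a restarted peer always gossips a newer version than the one previously seen.

```go
package main

// PeerName and PeerUID are simplified stand-ins defined for this sketch.
type PeerName string
type PeerUID uint64

// peerInfo is what this peer remembers about a remote peer.
type peerInfo struct {
	UID     PeerUID
	Version uint64
}

// Topology is a hypothetical model of one peer's view of the mesh.
type Topology struct {
	peers map[PeerName]peerInfo
	onGC  []func(PeerName)
}

func NewTopology() *Topology {
	return &Topology{peers: make(map[PeerName]peerInfo)}
}

// OnGC registers a callback to run when a peer is considered gone.
func (t *Topology) OnGC(cb func(PeerName)) {
	t.onGC = append(t.onGC, cb)
}

// ApplyUpdate merges one gossiped peer record. If the UID differs from the
// one already known, the peer must have restarted, so the GC callbacks run
// for the old incarnation before the new one is recorded.
func (t *Topology) ApplyUpdate(name PeerName, uid PeerUID, version uint64) {
	old, known := t.peers[name]
	if known && version <= old.Version {
		return // stale or duplicate update, nothing to do
	}
	if known && old.UID != uid {
		for _, cb := range t.onGC {
			cb(name)
		}
	}
	t.peers[name] = peerInfo{UID: uid, Version: version}
}
```

With this shape, a DNS layer could register a callback that drops all entries owned by the named peer, so entries of the old incarnation are cleared even when the removal itself was never observed.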

peterbourgon, Feb 19 '16 18:02