
More robust network backend

Open mcpherrinm opened this issue 8 years ago • 8 comments

We'd like keysync to handle several server failure scenarios, so we need a more robust backend.

  • Retry support
    • During a sync, we should try to fetch the secret list multiple times if it fails
    • We should try to fetch each secret multiple times before moving on
  • Failover support
    • If there are too many consecutive failures talking to a server, we should try a second one
    • Probably an MX-record-like weighted priority list
  • Backoff between failovers and retries
  • Any one sync should occur against the same server
    • Avoids issues with lagging MySQL replication and an inconsistent server view of the world
  • Log at info level on individual retries
  • Log at warn level on failover
  • Log at error level if all servers fail

mcpherrinm avatar Jun 07 '17 06:06 mcpherrinm

A lot of the client is straight from keywhiz-fs and should be refactored to make this easier to implement too

mcpherrinm avatar Jun 07 '17 06:06 mcpherrinm

All the values here should be tweakable via configuration.

mcpherrinm avatar Jun 07 '17 07:06 mcpherrinm

We probably want a global rate limit too, to avoid hammering the server, e.g. if we're asked to sync repeatedly.

mcpherrinm avatar Jun 13 '17 01:06 mcpherrinm

Quick question: why go through a full synchronisation every time? I feel it would be much more efficient to trigger a sync based on the /secrets response.

Request the /secrets endpoint at the configured poll interval and only sync the secrets that need to be synced. You could do that by going over all the secrets in the response payload and only syncing those whose creationDate or updateDate is after the last successful sync (which could be null), or by comparing their checksums.

Finally, do a "cleaning" pass that simply removes files that are present on the filesystem but not in the server response.

madtrax avatar Jun 14 '17 07:06 madtrax

What you describe is already implemented. I am using the term "sync" to refer to that behavior - fetch secrets as-needed based on the server response.

mcpherrinm avatar Jun 14 '17 07:06 mcpherrinm

In particular, the secretState struct (https://github.com/square/keysync/blob/master/syncer.go#L41-L52) is where the data used to make that decision is stored.

mcpherrinm avatar Jun 14 '17 07:06 mcpherrinm

Great. I noticed this behaviour in an old version I was playing with a few weeks or months ago. I'll check the latest version.

On another note, I believe we should be able to configure the pollIntervalFailureThresholdMultiplier, especially if the poll interval is very high.

Thanks!

madtrax avatar Jun 14 '17 07:06 madtrax

Yeah. I think we're going to want a handful of tunables here. I'll probably implement this next week and figure out what exactly those are.

mcpherrinm avatar Jun 14 '17 07:06 mcpherrinm