
Make reporting calls idempotent

Open fortuna opened this issue 9 years ago • 5 comments

In case of network errors, the client may issue a report request twice, but we don't want to record it twice.

In any case, the client will need to generate a unique identifier for the request and pass that as a parameter. We can use a UUID library such as uuid.
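A minimal sketch of the client side, assuming Node's built-in `randomUUID` as a stand-in for the `uuid` library. `Report`, `SendReport`, `newReport` and `sendWithRetry` are hypothetical names, not existing code. The point is that the id is generated once per report and reused on every retry:

```typescript
import { randomUUID } from "crypto";

interface Report {
  requestId: string; // generated once per logical report
  payload: string;
}

// Hypothetical transport: rejects on network errors.
type SendReport = (report: Report) => Promise<void>;

// Stamp the report with its id at creation time, not at send time.
function newReport(payload: string): Report {
  return { requestId: randomUUID(), payload };
}

// Retry with the same report object, so the server sees the same requestId.
async function sendWithRetry(
  send: SendReport,
  report: Report,
  maxAttempts = 3,
): Promise<void> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await send(report);
      return;
    } catch (err) {
      lastError = err; // e.g. a network error; try again with the same id
    }
  }
  throw lastError;
}
```

Generating the id inside the retry loop would defeat the purpose, since each attempt would then look like a new report to the server.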

On the server side, there are a few possibilities:

  1. Use the unique id as the key for the insertion. A retried request would be written twice, but that's OK: the second write lands on the same key, so the record isn't duplicated.
  2. Keep a set of "processed ids" in memory and check against that. This is simple, but breaks on server restarts.
  3. Keep a set of "processed ids" in Cloud Datastore. However, the Datastore is eventually consistent by default, so we should check the parameters to increase consistency. This also requires a read before the write.

Any other ideas? #1 seems to have the best robustness/simplicity trade-off.
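For illustration, option 1 could look roughly like this. The `Map` is only an in-memory stand-in for a keyed store such as Cloud Datastore, and `ReportStore` and `record` are hypothetical names:

```typescript
interface Report {
  requestId: string; // unique id generated by the client
  payload: string;
}

// In-memory stand-in for a store that supports writes keyed by id.
class ReportStore {
  private readonly rows = new Map<string, Report>();

  // Upsert keyed by requestId: a repeated request overwrites the same row
  // instead of adding a second one, and no read is needed before the write.
  record(report: Report): void {
    this.rows.set(report.requestId, report);
  }

  get size(): number {
    return this.rows.size;
  }
}

const store = new ReportStore();
const report: Report = { requestId: "example-id", payload: "ok" };
store.record(report);
store.record(report); // client retry after a network error
console.log(store.size); // 1
```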

fortuna avatar Aug 11 '16 20:08 fortuna

Vector clocks? https://en.wikipedia.org/wiki/Vector_clock http://bravenewgeek.com/you-cannot-have-exactly-once-delivery/

jab avatar Aug 11 '16 21:08 jab

Josh, we don't actually care about the ordering of the events. They can be recorded in any order, as long as each is recorded only once. We also don't need a strongly consistent data store.

fortuna avatar Aug 11 '16 21:08 fortuna

Gotcha. +1 to #1, I see no downside to that. (Were you hoping to use something else for the primary key?)

Not sure how frequently reports are sent, and whether new data would accumulate in between a failed request and a retry, but just in case, would it be worth allowing the client to include multiple reports in the same request, to save unnecessary roundtrips? (Note I'm not proposing that the client have logic to merge multiple reports into a single report, since I take it we probably want to keep all report processing logic on the server.)

jab avatar Aug 11 '16 22:08 jab

I'd keep it a single request for now. However, maybe we should move to a design where we send a JSON object in the body of the request instead of using the URL. That would allow us to use more complex structures and more easily switch to batch support, besides hiding URL parameters from possible logging along the way.
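A rough sketch of what the body could look like, assuming one report per request for now. The field names `requestId` and `report` and both helper functions are hypothetical:

```typescript
interface ReportBody {
  requestId: string; // unique id generated by the client
  report: Record<string, unknown>; // free-form report fields
}

// Client side: serialize the report into the request body.
function encodeReportBody(body: ReportBody): string {
  return JSON.stringify(body);
}

// Server side: parse and validate the body before recording anything.
function decodeReportBody(raw: string): ReportBody {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("body must be a JSON object");
  }
  const { requestId, report } = parsed as Record<string, unknown>;
  if (typeof requestId !== "string" || requestId.length === 0) {
    throw new Error("requestId must be a non-empty string");
  }
  if (typeof report !== "object" || report === null || Array.isArray(report)) {
    throw new Error("report must be a JSON object");
  }
  return { requestId, report: report as Record<string, unknown> };
}
```

Moving to batches later would mostly mean accepting an array of these objects, each with its own `requestId`.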

fortuna avatar Aug 12 '16 21:08 fortuna

+1 to #1.

trevj avatar Aug 22 '16 18:08 trevj