nats-architecture-and-design icon indicating copy to clipboard operation
nats-architecture-and-design copied to clipboard

KeyValue atomic delete and purge methods.

Open davidmcote opened this issue 11 months ago • 7 comments

Motivation

NATS clients provide the KeyValue interface with methods for optimistic atomic updates. (Eg: In Java, KeyValue.update(key, value, expectedRevision))

There doesn’t appear to be a way to specify an expectedRevision for delete() or purge(), neither in Java nor in the NATS cli.

However, I am able to craft a nats req for KV-Operation:DEL which specifies Nats-Expected-Last-Subject-Sequence and works in the expected manner: only writing a tombstone when the expectedRevision still matches.

% nats req -H KV-Operation:DEL -H Nats-Expected-Last-Subject-Sequence:6 '$KV.mybucket.key' ''
09:45:30 Sending request on "$KV.mybucket.key"
09:45:30 Received with rtt 345.313µs
{"error":{"code":400,"err_code":10071,"description":"wrong last sequence: 9"},"stream":"KV_mybucket","seq":0}

% nats req -H KV-Operation:DEL -H Nats-Expected-Last-Subject-Sequence:9 '$KV.mybucket.key' ''
09:45:35 Sending request on "$KV.mybucket.key"
09:45:35 Received with rtt 408.754µs
{"stream":"KV_mybucket", "seq":10}

There doesn't appear to be any technical limitations in JetStream for supporting an atomic delete on KV keys.

Indeed, it looks like this capability is already present in the go client: https://github.com/nats-io/nats.go/blob/main/jetstream/kv.go#L134

// [LastRevision] option can be specified to only perform delete if the
// latest revision the provided one.
Delete(ctx context.Context, key string, opts ...KVDeleteOpt) error
Purge(ctx context.Context, key string, opts ...KVDeleteOpt) error

Proposed change

Define two new methods in the KeyValue interface and corresponding implementations of these methods in NatsKeyValue.

void delete(String key, long expectedRevision) throws IOException, JetStreamApiException;
void purge(String key, long expectedRevision) throws IOException, JetStreamApiException;

These methods would enforce expectedRevision in the same manner as long update(String key, byte[] value, long expectedRevision) : by setting EXPECTED_LAST_SUB_SEQ_HDR and propagating the exception from _write() if the JetStream server rejects the request.

Use case

Optimistic/atomic delete & purge of a KV entry.

As a specific example: A simple distributed locking mechanism can be built upon a KV bucket with TTL with the following rules:

  • Clients may acquire a lock IFF a particular key is not already present in the KV (atomic create.)
  • Clients may renew a lock IFF they originally acquired it (atomic update.)
  • Clients may release a lock IFF they currently hold it (atomic delete - missing!)

Without an atomic delete, clients must instead release their lock by either using the non-atomic delete() method, or passively allowing the TTL to expire.

This opens up a race condition where:

  1. A client acquires a lock and unintentionally lapses the TTL.
  2. A second process atomic-creates a new entry to acquire the lock.
  3. The first client calls delete(), expecting to relinquish its own lock.
  4. A third process atomic-creates a new entry to acquire the lock, resulting in two processes thinking they hold the lock simultaneously.

Contribution

My team has implemented our own atomic delete for KeyValue using the lower level JetStream.publish() method in our application which seems to be working for us. But we'd really prefer to be using an "official" KeyValue interface.

If NATS maintainers agree with the implementation proposed above or have suggestions on the approach, we could submit this contribution.

EDIT (by @Jarema ):

As we accepted this to be part of all clients, I'm adding standard checklist:

Clients and Tools

  • [ ] Schemas @ripienaar
  • [ ] CLI @ripienaar
  • [ ] Terraform @ripienaar
  • [x] Go @piotrpio
  • [x] Java @scottf https://github.com/nats-io/nats.java/pull/1092
  • [x] JavaScript @aricart https://github.com/nats-io/nats.deno/pull/656
  • [x] .Net @scottf https://github.com/nats-io/nats.net/pull/872
  • [ ] C @levb
  • [ ] Python @wallyqs
  • [ ] Ruby @wallyqs
  • [ ] Rust @Jarema
  • [ ] .Net V2 @mtmk

Other Tasks

  • [ ] docs.nats.io updated @bruth
  • [ ] Update ADR to Implemented
  • [ ] Update client features spreadsheet

Client authors please update with your progress. If you open issues in your own repositories as a result of this request, please link them to this one by pasting the issue URL in a comment or main issue description.

davidmcote avatar Feb 29 '24 23:02 davidmcote

Moving this to the architecture repo since it's feature request that would apply to all clients.

scottf avatar Mar 01 '24 05:03 scottf

Thanks Scott! That makes sense.

By the way, I found that at least the go client already has this capability. I've edited the proposal to include this detail.

Our offer to submit a contribution to the Java client is standing if the maintainers confirm the proposed approach.

davidmcote avatar Mar 01 '24 12:03 davidmcote

Yes happy to have this everywhere if go has it already 👍

ripienaar avatar Mar 01 '24 12:03 ripienaar

@davidmcote Yes, contributions are encouraged! Please make sure to verify your commits. I made an issue in the repo: https://github.com/nats-io/nats.java/issues/1091

scottf avatar Mar 01 '24 13:03 scottf

Thanks for pointing this out @davidmcote! That would be an awesome addition, as I already use this for deletes quite frequently in my own projects

codegangsta avatar Mar 01 '24 18:03 codegangsta

agree. This is a good think to have across the languages.

@davidmcote I allowed myself to edit your issue and add our standard checks for cross-language support. Thanks for reporting this!

Jarema avatar Mar 01 '24 18:03 Jarema

done for the JavaScript clients. https://github.com/nats-io/nats.deno/pull/656

aricart avatar Mar 01 '24 18:03 aricart