aerospike-client-ruby icon indicating copy to clipboard operation
aerospike-client-ruby copied to clipboard

Fix batch_operate for multi-node cluster

Open opti opened this issue 4 months ago • 0 comments

What

That PR fixes an issue of missing records when the client.batch_operate is used with multiple BatchRead on a multi-node setup.

Why

With 2+ nodes, batch_operate with more than one BatchRead record will only update records with results from the first node. Or it updates records with wrong results (from other records).

To demonstrate, the following existing specs will fail when run on a multi-node cluster, but they're OK on a single-node run:

Failures:

  1) Aerospike::Client#batch_operate #BatchRead returns bins specified with operations
     Failure/Error: expect(records[1].result_code).to eql(0)

       expected: 0
            got: -19

       (compared using eql?)
     # ./spec/aerospike/batch_operate_spec.rb:76:in `block (4 levels) in <top (required)>'

  2) Aerospike::Client#batch_operate #BatchRead TTL #BatchRead
     Failure/Error: expect(br2.result_code).to eql(Aerospike::ResultCode::OK)

       expected: 0
            got: -19

       (compared using eql?)
     # ./spec/aerospike/batch_operate_spec.rb:132:in `block (4 levels) in <top (required)>'

Finished in 12.33 seconds (files took 0.21945 seconds to load)
13 examples, 2 failures

Failed examples:

rspec ./spec/aerospike/batch_operate_spec.rb:64 # Aerospike::Client#batch_operate #BatchRead returns bins specified with operations
rspec ./spec/aerospike/batch_operate_spec.rb:113 # Aerospike::Client#batch_operate #BatchRead TTL #BatchRead

Notes

However, I'm not entirely sure the fix is correct, since I don't know what was the original purpose of having all records and batch.records available in BatchOperateCommand. I can see both are used to write the data offset. https://github.com/aerospike/aerospike-client-ruby/blob/fb54d1d0a197d2fcedda87b8c959fa07f795c322/lib/aerospike/command/batch_operate_command.rb#L53-L55 https://github.com/aerospike/aerospike-client-ruby/blob/fb54d1d0a197d2fcedda87b8c959fa07f795c322/lib/aerospike/command/batch_operate_command.rb#L81

At the same time, BatchOperateCommand is created per each batch node:https://github.com/aerospike/aerospike-client-ruby/blob/fb54d1d0a197d2fcedda87b8c959fa07f795c322/lib/aerospike/client.rb#L1002-L1007

So, feedback is appreciated.

But as it stands right now, batch_operate cannot be used to read multiple records in a multi-node setup. It might be also a good idea to run tests in multi-node setup in the github workflow.

opti avatar Oct 24 '24 00:10 opti