valkey icon indicating copy to clipboard operation
valkey copied to clipboard

Reply Copy Avoidance

Open xbasel opened this issue 6 months ago • 3 comments

Overview

This PR introduces the ability to avoid copying whole content of string object into replies (i.e. bulk string replies) and to allow I/O threads refer directly to obj->ptr in writev iov as described at #1353.

Key Changes

  • Added capability to reply construction allowing to interleave regular replies with copy avoid replies in client reply buffers
  • Extended write-to-client handlers to support copy avoid replies
  • Added copy avoidance of string bulk replies when copy avoidance indicated by I/O threads
  • Minor changes in cluster slots stats in order to support network-bytes-out for copy avoid replies
  • Copy avoidance is beneficial for performance despite object size only starting certain number of threads. So it will be enabled only starting certain number of threads. Internal configuration min-io-threads-copy-avoid introduced to manage this number of threads

Note: When copy avoidance disabled content and handling of client reply buffers remains as before this PR

Implementation Details

client and clientReplyBlock structs:

  1. buf_encoded flag has been added to clientReplyBlock struct and to client struct for static c->buf to indicate if reply buffer is in copy avoidance mode (i.e. include headers and payloads) or not (i.e. plain replies only).
  2. io_last_written_buf, io_last_written_bufpos, io_last_written_data_len fields added client struct to to keep track of write state between writevToClient invocations

Reply construction:

  1. Original _addReplyToBuffer and _addReplyProtoToList have been renamed to _addReplyPayloadToBuffer and _addReplyPayloadToList and extended to support different types of payloads - regular replies and copy avoid replies.
  2. New _addReplyToBuffer and _addReplyProtoToList calls now _addReplyPayloadToBuffer and _addReplyPayloadToList and used for adding regular replies to client reply buffers.
  3. Newly introduced _addBulkOffloadToBuffer and _addBulkOffloadToList are used for adding copy avoid replies to client reply buffers.

Write-to-client infrastructure:

The writevToClient and _postWriteToClient has been significantly changed to support copy avoidance capability.

Internal configuration:

  1. min-io-threads-avoid-copy-reply - Minimum number of IO threads for copy avoidance
  2. min-string-size-avoid-copy-reply - Minimum bulk string size for copy avoidance when IO threads disabled
  3. min-string-size-avoid-copy-reply-threaded - Minimum bulk string size for copy avoidance when IO threads enabled

Testing

  1. Existing unit and integration tests passed. Copy avoidance enabled on tests with --io-threads flag
  2. Added unit tests for copy avoidance functionality

Performance Tests

Note: pay attention io-threads 1 config means only main thread with no additional io-threads, io-threads 2 means main thread plus 1 I/O thread, io-threads 9 means main thread plus 8 I/O threads.

512 byte object size

Tests are conducted on memory optimized instances using:

  • 3,000,000 keys
  • 512 bytes object size
  • 1000 clients
io-threads (including main thread) Plain Reply Copy Avoidance
7 1,160,000 1,160,000
8 1,150,000 1,280,000
9 1,150,000 1,330,000
10 N/A 1,380,000
11 N/A 1,420,000

Various object size, small number of threads

iothreads Data size Keys Clients Instance type Unstable branch Copy Avoidance On
1 512 byte 3,000,000 1,000 memory optimized 195,000 195,000
2 512 byte 3,000,000 1,000 memory optimized 245,000 245,000
3 512 byte 3,000,000 1,000 memory optimized 455,000 459,000
4 512 byte 3,000,000 1,000 memory optimized 685,000 685,000
1 1K 3,000,000 1,000 memory optimized 185,000 185,000
2 1K 3,000,000 1,000 memory optimized 235,000 235,000
3 1K 3,000,000 1,000 memory optimized 450,000 450,000
1 4K 1,000,000 1,000 network optimized 182,000 187,000
2 4K 1,000,000 1,000 network optimized 240,000 238,000
1 16K 1,000,000 500 network optimized 100,000 120,000
2 16K 1,000,000 500 network optimized 140,000 140,000
3 16K 1,000,000 500 network optimized 275,000 260,000
1 32K 500,000 500 network optimized 57,000 90,000
2 32K 500,000 500 network optimized 110,000 110,000
3 32K 500,000 500 network optimized 215,000 215,000
1 64K 100,000 500 network optimized 30,000 57,000
2 64K 100,000 500 network optimized 69,000 61,000
3 64K 100,000 500 network optimized 120,000 120,000
4 64K 100,000 500 network optimized 115,000 - 175,000 175,000
5 64K 100,000 500 network optimized 115,000 - 165,000 230,000

xbasel avatar May 13 '25 10:05 xbasel

Codecov Report

Attention: Patch coverage is 91.63880% with 25 lines in your changes missing coverage. Please review.

Project coverage is 71.51%. Comparing base (26b6202) to head (82610aa). Report is 6 commits behind head on unstable.

Files with missing lines Patch % Lines
src/networking.c 93.11% 19 Missing :warning:
src/io_threads.c 0.00% 4 Missing :warning:
src/memory_prefetch.c 0.00% 2 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2078      +/-   ##
============================================
+ Coverage     71.29%   71.51%   +0.21%     
============================================
  Files           122      122              
  Lines         66196    66410     +214     
============================================
+ Hits          47197    47490     +293     
+ Misses        18999    18920      -79     
Files with missing lines Coverage Δ
src/cluster_slot_stats.c 94.21% <100.00%> (-0.17%) :arrow_down:
src/config.c 78.52% <ø> (ø)
src/replication.c 86.72% <100.00%> (+0.08%) :arrow_up:
src/server.c 87.93% <100.00%> (-0.10%) :arrow_down:
src/server.h 100.00% <ø> (ø)
src/memory_prefetch.c 12.21% <0.00%> (-0.29%) :arrow_down:
src/io_threads.c 35.00% <0.00%> (-0.34%) :arrow_down:
src/networking.c 88.18% <93.11%> (+0.59%) :arrow_up:

... and 10 files with indirect coverage changes

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar May 13 '25 10:05 codecov[bot]

Clone of https://github.com/valkey-io/valkey/pull/1457 + test fixes

xbasel avatar May 15 '25 00:05 xbasel

So Sasha is not going to work on https://github.com/valkey-io/valkey/pull/1457 anymore? We should close it then to avoid confusion.

zuiderkwast avatar May 15 '25 07:05 zuiderkwast

The schema validation seems to be failing consistently. It's not reporting which test, but it seems to be because its expecting an integer reply and getting nothing.

madolson avatar Jun 04 '25 02:06 madolson

Also checked to see if there were any test failures if you force everything to offload, and just the ones we expected:

*** [err]: COMMANDLOG - only logs commands exceeding the threshold in tests/unit/commandlog.tcl
Expected '0' to be equal to '1' (context: type source line 33 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [r commandlog len large-reply] 1} proc ::test)
*** [err]: COMMANDLOG - logged entry sanity check in tests/unit/commandlog.tcl
Expected '0' to be equal to '6' (context: type source line 127 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [llength $e] 6} proc ::test)
*** [err]: No response for multi commands in pipeline if client output buffer limit is enforced in tests/unit/obuf-limits.tcl
Expected 'id=3 addr=127.0.0.1:53988 laddr=127.0.0.1:26114 fd=12 name= age=19 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=26 qbuf-free=16864 argv-mem=10 multi-mem=0 rbs=4096 rbp=2048 obl=0 oll=0 omem=0 tot-mem=21578 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=8485608 tot-net-out=19895313 tot-cmds=57118
id=8 addr=127.0.0.1:54274 laddr=127.0.0.1:26114 fd=13 name= age=1 idle=1 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=16384 rbp=5 obl=0 oll=0 omem=0 tot-mem=16944 events=r cmd=debug user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=56 tot-net-out=10 tot-cmds=2
id=9 addr=127.0.0.1:54275 laddr=127.0.0.1:26114 fd=14 name=multicommands age=1 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=16890 argv-mem=0 multi-mem=0 rbs=16384 rbp=16 obl=0 oll=0 omem=0 tot-mem=33840 events=r cmd=get user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=617 tot-net-out=150405 tot-cmds=62
' to not match '*name=multicommands*' (context: type source line 187 file /Users/madolson/git/valkey/tests/unit/obuf-limits.tcl cmd {assert_no_match "*name=multicommands*" $clients} proc ::test)

So for the schema, I think we just need to run it locally and see what test is actually failing (it'll be in the output file) and then we can check what is not working correctly.

madolson avatar Jun 04 '25 02:06 madolson

Also checked to see if there were any test failures if you force everything to offload, and just the ones we expected:

*** [err]: COMMANDLOG - only logs commands exceeding the threshold in tests/unit/commandlog.tcl
Expected '0' to be equal to '1' (context: type source line 33 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [r commandlog len large-reply] 1} proc ::test)
*** [err]: COMMANDLOG - logged entry sanity check in tests/unit/commandlog.tcl
Expected '0' to be equal to '6' (context: type source line 127 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [llength $e] 6} proc ::test)
*** [err]: No response for multi commands in pipeline if client output buffer limit is enforced in tests/unit/obuf-limits.tcl
Expected 'id=3 addr=127.0.0.1:53988 laddr=127.0.0.1:26114 fd=12 name= age=19 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=26 qbuf-free=16864 argv-mem=10 multi-mem=0 rbs=4096 rbp=2048 obl=0 oll=0 omem=0 tot-mem=21578 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=8485608 tot-net-out=19895313 tot-cmds=57118
id=8 addr=127.0.0.1:54274 laddr=127.0.0.1:26114 fd=13 name= age=1 idle=1 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=16384 rbp=5 obl=0 oll=0 omem=0 tot-mem=16944 events=r cmd=debug user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=56 tot-net-out=10 tot-cmds=2
id=9 addr=127.0.0.1:54275 laddr=127.0.0.1:26114 fd=14 name=multicommands age=1 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=16890 argv-mem=0 multi-mem=0 rbs=16384 rbp=16 obl=0 oll=0 omem=0 tot-mem=33840 events=r cmd=get user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=617 tot-net-out=150405 tot-cmds=62
' to not match '*name=multicommands*' (context: type source line 187 file /Users/madolson/git/valkey/tests/unit/obuf-limits.tcl cmd {assert_no_match "*name=multicommands*" $clients} proc ::test)

So for the schema, I think we just need to run it locally and see what test is actually failing (it'll be in the output file) and then we can check what is not working correctly.

  1. Reply schema fails because logreqres copies reply as is (with points). We can disable copy avoidance when testing reply schema
  2. The failures above are related to output buffer sizes

xbasel avatar Jun 04 '25 21:06 xbasel

The undefined-sanitizer fails in Daily with an unaligned pointer address. I'm just guessing it can be related to this PR. Is it?

https://github.com/valkey-io/valkey/actions/runs/15547432999/job/43771545681#step:10:301

unit/test_networking.c:481:5: runtime error: load of misaligned address 0x55f458457063 for type 'struct robj *', which requires 8 byte alignment
0x55f458457063: note: pointer points here
 00  01 ff ff 30 83 42 58 f4  55 00 00 23 41 45 58 f4  55 00 00 00 00 00 00 00  00 00 00 00 00 00 00
              ^ 
List size:     8192, bytes:    49159,

zuiderkwast avatar Jun 10 '25 07:06 zuiderkwast

The undefined-sanitizer fails in Daily with an unaligned pointer address. I'm just guessing it can be related to this PR. Is it?

Seems likely. @xbasel Can you take a look?

madolson avatar Jun 10 '25 15:06 madolson

The undefined-sanitizer fails in Daily with an unaligned pointer address. I'm just guessing it can be related to this PR. Is it?

Seems likely. @xbasel Can you take a look?

https://github.com/valkey-io/valkey/pull/2201

xbasel avatar Jun 11 '25 13:06 xbasel