valkey
valkey copied to clipboard
Reply Copy Avoidance
Overview
This PR introduces the ability to avoid copying whole content of string object into replies (i.e. bulk string replies) and to allow I/O threads refer directly to obj->ptr in writev iov as described at #1353.
Key Changes
- Added capability to reply construction allowing to interleave regular replies with copy avoid replies in client reply buffers
- Extended write-to-client handlers to support copy avoid replies
- Added copy avoidance of string bulk replies when copy avoidance indicated by I/O threads
- Minor changes in cluster slots stats in order to support
network-bytes-outfor copy avoid replies - Copy avoidance is beneficial for performance despite object size only starting certain number of threads. So it will be enabled only starting certain number of threads. Internal configuration
min-io-threads-copy-avoidintroduced to manage this number of threads
Note: When copy avoidance disabled content and handling of client reply buffers remains as before this PR
Implementation Details
client and clientReplyBlock structs:
buf_encodedflag has been added toclientReplyBlockstruct and toclientstruct for staticc->bufto indicate if reply buffer is in copy avoidance mode (i.e. include headers and payloads) or not (i.e. plain replies only).io_last_written_buf,io_last_written_bufpos,io_last_written_data_lenfields addedclientstruct to to keep track of write state betweenwritevToClientinvocations
Reply construction:
- Original
_addReplyToBufferand_addReplyProtoToListhave been renamed to_addReplyPayloadToBufferand_addReplyPayloadToListand extended to support different types of payloads - regular replies and copy avoid replies. - New
_addReplyToBufferand_addReplyProtoToListcalls now_addReplyPayloadToBufferand_addReplyPayloadToListand used for adding regular replies to client reply buffers. - Newly introduced
_addBulkOffloadToBufferand_addBulkOffloadToListare used for adding copy avoid replies to client reply buffers.
Write-to-client infrastructure:
The writevToClient and _postWriteToClient has been significantly changed to support copy avoidance capability.
Internal configuration:
min-io-threads-avoid-copy-reply- Minimum number of IO threads for copy avoidancemin-string-size-avoid-copy-reply- Minimum bulk string size for copy avoidance when IO threads disabledmin-string-size-avoid-copy-reply-threaded- Minimum bulk string size for copy avoidance when IO threads enabled
Testing
- Existing unit and integration tests passed. Copy avoidance enabled on tests with
--io-threadsflag - Added unit tests for copy avoidance functionality
Performance Tests
Note: pay attention io-threads 1 config means only main thread with no additional io-threads, io-threads 2 means main thread plus 1 I/O thread, io-threads 9 means main thread plus 8 I/O threads.
512 byte object size
Tests are conducted on memory optimized instances using:
- 3,000,000 keys
- 512 bytes object size
- 1000 clients
| io-threads (including main thread) | Plain Reply | Copy Avoidance |
|---|---|---|
| 7 | 1,160,000 | 1,160,000 |
| 8 | 1,150,000 | 1,280,000 |
| 9 | 1,150,000 | 1,330,000 |
| 10 | N/A | 1,380,000 |
| 11 | N/A | 1,420,000 |
Various object size, small number of threads
| iothreads | Data size | Keys | Clients | Instance type | Unstable branch | Copy Avoidance On |
|---|---|---|---|---|---|---|
| 1 | 512 byte | 3,000,000 | 1,000 | memory optimized | 195,000 | 195,000 |
| 2 | 512 byte | 3,000,000 | 1,000 | memory optimized | 245,000 | 245,000 |
| 3 | 512 byte | 3,000,000 | 1,000 | memory optimized | 455,000 | 459,000 |
| 4 | 512 byte | 3,000,000 | 1,000 | memory optimized | 685,000 | 685,000 |
| 1 | 1K | 3,000,000 | 1,000 | memory optimized | 185,000 | 185,000 |
| 2 | 1K | 3,000,000 | 1,000 | memory optimized | 235,000 | 235,000 |
| 3 | 1K | 3,000,000 | 1,000 | memory optimized | 450,000 | 450,000 |
| 1 | 4K | 1,000,000 | 1,000 | network optimized | 182,000 | 187,000 |
| 2 | 4K | 1,000,000 | 1,000 | network optimized | 240,000 | 238,000 |
| 1 | 16K | 1,000,000 | 500 | network optimized | 100,000 | 120,000 |
| 2 | 16K | 1,000,000 | 500 | network optimized | 140,000 | 140,000 |
| 3 | 16K | 1,000,000 | 500 | network optimized | 275,000 | 260,000 |
| 1 | 32K | 500,000 | 500 | network optimized | 57,000 | 90,000 |
| 2 | 32K | 500,000 | 500 | network optimized | 110,000 | 110,000 |
| 3 | 32K | 500,000 | 500 | network optimized | 215,000 | 215,000 |
| 1 | 64K | 100,000 | 500 | network optimized | 30,000 | 57,000 |
| 2 | 64K | 100,000 | 500 | network optimized | 69,000 | 61,000 |
| 3 | 64K | 100,000 | 500 | network optimized | 120,000 | 120,000 |
| 4 | 64K | 100,000 | 500 | network optimized | 115,000 - 175,000 | 175,000 |
| 5 | 64K | 100,000 | 500 | network optimized | 115,000 - 165,000 | 230,000 |
Codecov Report
Attention: Patch coverage is 91.63880% with 25 lines in your changes missing coverage. Please review.
Project coverage is 71.51%. Comparing base (
26b6202) to head (82610aa). Report is 6 commits behind head on unstable.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/networking.c | 93.11% | 19 Missing :warning: |
| src/io_threads.c | 0.00% | 4 Missing :warning: |
| src/memory_prefetch.c | 0.00% | 2 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## unstable #2078 +/- ##
============================================
+ Coverage 71.29% 71.51% +0.21%
============================================
Files 122 122
Lines 66196 66410 +214
============================================
+ Hits 47197 47490 +293
+ Misses 18999 18920 -79
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/cluster_slot_stats.c | 94.21% <100.00%> (-0.17%) |
:arrow_down: |
| src/config.c | 78.52% <ø> (ø) |
|
| src/replication.c | 86.72% <100.00%> (+0.08%) |
:arrow_up: |
| src/server.c | 87.93% <100.00%> (-0.10%) |
:arrow_down: |
| src/server.h | 100.00% <ø> (ø) |
|
| src/memory_prefetch.c | 12.21% <0.00%> (-0.29%) |
:arrow_down: |
| src/io_threads.c | 35.00% <0.00%> (-0.34%) |
:arrow_down: |
| src/networking.c | 88.18% <93.11%> (+0.59%) |
:arrow_up: |
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
Clone of https://github.com/valkey-io/valkey/pull/1457 + test fixes
So Sasha is not going to work on https://github.com/valkey-io/valkey/pull/1457 anymore? We should close it then to avoid confusion.
The schema validation seems to be failing consistently. It's not reporting which test, but it seems to be because its expecting an integer reply and getting nothing.
Also checked to see if there were any test failures if you force everything to offload, and just the ones we expected:
*** [err]: COMMANDLOG - only logs commands exceeding the threshold in tests/unit/commandlog.tcl
Expected '0' to be equal to '1' (context: type source line 33 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [r commandlog len large-reply] 1} proc ::test)
*** [err]: COMMANDLOG - logged entry sanity check in tests/unit/commandlog.tcl
Expected '0' to be equal to '6' (context: type source line 127 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [llength $e] 6} proc ::test)
*** [err]: No response for multi commands in pipeline if client output buffer limit is enforced in tests/unit/obuf-limits.tcl
Expected 'id=3 addr=127.0.0.1:53988 laddr=127.0.0.1:26114 fd=12 name= age=19 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=26 qbuf-free=16864 argv-mem=10 multi-mem=0 rbs=4096 rbp=2048 obl=0 oll=0 omem=0 tot-mem=21578 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=8485608 tot-net-out=19895313 tot-cmds=57118
id=8 addr=127.0.0.1:54274 laddr=127.0.0.1:26114 fd=13 name= age=1 idle=1 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=16384 rbp=5 obl=0 oll=0 omem=0 tot-mem=16944 events=r cmd=debug user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=56 tot-net-out=10 tot-cmds=2
id=9 addr=127.0.0.1:54275 laddr=127.0.0.1:26114 fd=14 name=multicommands age=1 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=16890 argv-mem=0 multi-mem=0 rbs=16384 rbp=16 obl=0 oll=0 omem=0 tot-mem=33840 events=r cmd=get user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=617 tot-net-out=150405 tot-cmds=62
' to not match '*name=multicommands*' (context: type source line 187 file /Users/madolson/git/valkey/tests/unit/obuf-limits.tcl cmd {assert_no_match "*name=multicommands*" $clients} proc ::test)
So for the schema, I think we just need to run it locally and see what test is actually failing (it'll be in the output file) and then we can check what is not working correctly.
Also checked to see if there were any test failures if you force everything to offload, and just the ones we expected:
*** [err]: COMMANDLOG - only logs commands exceeding the threshold in tests/unit/commandlog.tcl Expected '0' to be equal to '1' (context: type source line 33 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [r commandlog len large-reply] 1} proc ::test) *** [err]: COMMANDLOG - logged entry sanity check in tests/unit/commandlog.tcl Expected '0' to be equal to '6' (context: type source line 127 file /Users/madolson/git/valkey/tests/unit/commandlog.tcl cmd {assert_equal [llength $e] 6} proc ::test) *** [err]: No response for multi commands in pipeline if client output buffer limit is enforced in tests/unit/obuf-limits.tcl Expected 'id=3 addr=127.0.0.1:53988 laddr=127.0.0.1:26114 fd=12 name= age=19 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=26 qbuf-free=16864 argv-mem=10 multi-mem=0 rbs=4096 rbp=2048 obl=0 oll=0 omem=0 tot-mem=21578 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=8485608 tot-net-out=19895313 tot-cmds=57118 id=8 addr=127.0.0.1:54274 laddr=127.0.0.1:26114 fd=13 name= age=1 idle=1 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=16384 rbp=5 obl=0 oll=0 omem=0 tot-mem=16944 events=r cmd=debug user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=56 tot-net-out=10 tot-cmds=2 id=9 addr=127.0.0.1:54275 laddr=127.0.0.1:26114 fd=14 name=multicommands age=1 idle=0 flags=N capa= db=9 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=16890 argv-mem=0 multi-mem=0 rbs=16384 rbp=16 obl=0 oll=0 omem=0 tot-mem=33840 events=r cmd=get user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=617 tot-net-out=150405 tot-cmds=62 ' to not match '*name=multicommands*' (context: type source line 187 file /Users/madolson/git/valkey/tests/unit/obuf-limits.tcl cmd {assert_no_match "*name=multicommands*" $clients} proc ::test)So for the schema, I think we just need to run it locally and see what test is actually failing (it'll be in the output file) and then we can check what is not working correctly.
- Reply schema fails because logreqres copies reply as is (with points). We can disable copy avoidance when testing reply schema
- The failures above are related to output buffer sizes
The undefined-sanitizer fails in Daily with an unaligned pointer address. I'm just guessing it can be related to this PR. Is it?
https://github.com/valkey-io/valkey/actions/runs/15547432999/job/43771545681#step:10:301
unit/test_networking.c:481:5: runtime error: load of misaligned address 0x55f458457063 for type 'struct robj *', which requires 8 byte alignment
0x55f458457063: note: pointer points here
00 01 ff ff 30 83 42 58 f4 55 00 00 23 41 45 58 f4 55 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
List size: 8192, bytes: 49159,
The undefined-sanitizer fails in Daily with an unaligned pointer address. I'm just guessing it can be related to this PR. Is it?
Seems likely. @xbasel Can you take a look?
The undefined-sanitizer fails in Daily with an unaligned pointer address. I'm just guessing it can be related to this PR. Is it?
Seems likely. @xbasel Can you take a look?
https://github.com/valkey-io/valkey/pull/2201