valkey icon indicating copy to clipboard operation
valkey copied to clipboard

Improve type safety and refactor dict entry handling

Open PingXie opened this issue 1 year ago • 4 comments

This pull request introduces several changes to improve the type safety, and clarity of Valkey's dictionary implementation:

  • Getter/Setter Macros: Implemented macros like DICT_SET_KEY, DICT_SET_VALUE, and DICT_GET_VALUE to centralize type casting within these macros. This change emulates the behavior of C++ templates in C, limiting type casting to specific low-level operations and preventing it from being spread across the codebase.

  • Reduced Assert Overhead: Removed unnecessary asserts from critical hot paths in the dictionary implementation.

  • Consistent Naming: Standardized the naming of dictionary entry types. For example, all dictionary entry types start their names with dictEntry.

  • Internal Naming Conventions: Removed the use of underscores (_) for internal static structures/functions.

  • Descriptive Function Names: Updated function names to be more descriptive, making their purpose clearer. For instance, _dictExpand is renamed to dictExpandIfAutoResizeAllowed.

Fix #737

PingXie avatar Jul 05 '24 02:07 PingXie

Codecov Report

Attention: Patch coverage is 83.92857% with 9 lines in your changes missing coverage. Please review.

Project coverage is 70.55%. Comparing base (b48596a) to head (e4bb137). Report is 34 commits behind head on unstable.

Files with missing lines Patch % Lines
src/dict.c 83.92% 9 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #749      +/-   ##
============================================
- Coverage     70.58%   70.55%   -0.04%     
============================================
  Files           112      114       +2     
  Lines         61528    61634     +106     
============================================
+ Hits          43431    43483      +52     
- Misses        18097    18151      +54     
Files with missing lines Coverage Δ
src/dict.c 97.26% <83.92%> (-0.28%) :arrow_down:

... and 35 files with indirect coverage changes

codecov[bot] avatar Jul 05 '24 03:07 codecov[bot]

I don't think the macros like DICT_SET_KEY, etc. improve type safety. Rather the opposite I think, and they don't help readability.

They help reduce the use of type casting to a minimum set of code (without the more expensive rewrite). They are not meant to be updated often, given their reusability. I think it is a good trade off (until we decide to rewrite duct.c from scratch)

PingXie avatar Jul 08 '24 22:07 PingXie

Just to comment on the performance aspect [even if we ignore the different type issue @PingXie mentioned]. I've seen many cases where inlining is not working as good as we think. Worth checking the objdump. Maybe the better equivalent alternative to macros in terms of performance should be using __attribute__((always_inline)). But, probably for a reason, that does not always compile

zvi-code avatar Jul 30 '24 18:07 zvi-code

This PR is ready for (final) review.

@hpatro I didn't do performance benchmarking. I think we will most likely measure noise. Here is the assembly code before and after this PR for dictSetVal. The difference is immaterial IMO. dictSetKey is similar.

before

 21052 000000000008aa00 <dictSetVal>:
 21053    8aa00:>--48 89 f0             >--mov    rax,rsi
 21054    8aa03:>--83 e0 07             >--and    eax,0x7
 21055    8aa06:>--74 28                >--je     8aa30 <dictSetVal+0x30>
 21056    8aa08:>--48 83 ec 08          >--sub    rsp,0x8
 21057    8aa0c:>--48 83 f8 04          >--cmp    rax,0x4
 21058    8aa10:>--74 26                >--je     8aa38 <dictSetVal+0x38>
 21059    8aa12:>--ba 79 03 00 00       >--mov    edx,0x379
 21060    8aa17:>--48 8d 35 79 3e 26 00 >--lea    rsi,[rip+0x263e79]        # 2ee897 <__PRETTY_FUNCTION__.8+0x17>
 21061    8aa1e:>--48 8d 3d fb 47 21 00 >--lea    rdi,[rip+0x2147fb]        # 29f220 <_IO_stdin_used+0x220>
 21062    8aa25:>--e8 76 5b 0a 00       >--call   1305a0 <_serverAssert>
 21063    8aa2a:>--66 0f 1f 44 00 00    >--nop    WORD PTR [rax+rax*1+0x0]
 21064    8aa30:>--48 89 56 08          >--mov    QWORD PTR [rsi+0x8],rdx
 21065    8aa34:>--c3                   >--ret
 21066    8aa35:>--0f 1f 00             >--nop    DWORD PTR [rax]
 21067    8aa38:>--40 f6 c6 01          >--test   sil,0x1
 21068    8aa3c:>--75 12                >--jne    8aa50 <dictSetVal+0x50>
 21069    8aa3e:>--48 83 e6 f8          >--and    rsi,0xfffffffffffffff8
 21070    8aa42:>--48 89 16             >--mov    QWORD PTR [rsi],rdx
 21071    8aa45:>--48 83 c4 08          >--add    rsp,0x8
 21072    8aa49:>--c3                   >--ret
 21073    8aa4a:>--66 0f 1f 44 00 00    >--nop    WORD PTR [rax+rax*1+0x0]
 21074    8aa50:>--ba ce 00 00 00       >--mov    edx,0xce
 21075    8aa55:>--48 8d 35 3b 3e 26 00 >--lea    rsi,[rip+0x263e3b]        # 2ee897 <__PRETTY_FUNCTION__.8+0x17>
 21076    8aa5c:>--48 8d 3d cf 47 21 00 >--lea    rdi,[rip+0x2147cf]        # 29f232 <_IO_stdin_used+0x232>
 21077    8aa63:>--e8 38 5b 0a 00       >--call   1305a0 <_serverAssert>
 21078    8aa68:>--0f 1f 84 00 00 00 00 >--nop    DWORD PTR [rax+rax*1+0x0]
 21079    8aa6f:>--00

after

22489 000000000008bd40 <dictSetVal.part.0>:
22490    8bd40:>--55                   >--push   rbp
22491    8bd41:>--ba ac 03 00 00       >--mov    edx,0x3ac
22492    8bd46:>--48 8d 35 8a 10 27 00 >--lea    rsi,[rip+0x27108a]        # 2fcdd7 <__PRETTY_FUNCTION__.8+0x17>
22493    8bd4d:>--48 8d 3d ec 14 22 00 >--lea    rdi,[rip+0x2214ec]        # 2ad240 <_IO_stdin_used+0x240>
22494    8bd54:>--48 89 e5             >--mov    rbp,rsp
22495    8bd57:>--e8 e4 ba 0a 00       >--call   137840 <_serverAssert>
22496    8bd5c:>--0f 1f 40 00          >--nop    DWORD PTR [rax+0x0]
22497
22498 000000000008bd60 <dictSetVal>:
22499    8bd60:>--48 89 f0             >--mov    rax,rsi
22500    8bd63:>--83 e0 07             >--and    eax,0x7
22501    8bd66:>--75 10                >--jne    8bd78 <dictSetVal+0x18>
22502    8bd68:>--48 83 e6 f8          >--and    rsi,0xfffffffffffffff8
22503    8bd6c:>--48 89 56 08          >--mov    QWORD PTR [rsi+0x8],rdx
22504    8bd70:>--c3                   >--ret
22505    8bd71:>--0f 1f 80 00 00 00 00 >--nop    DWORD PTR [rax+0x0]
22506    8bd78:>--48 83 f8 04          >--cmp    rax,0x4
22507    8bd7c:>--75 08                >--jne    8bd86 <dictSetVal+0x26>
22508    8bd7e:>--48 83 e6 f8          >--and    rsi,0xfffffffffffffff8
22509    8bd82:>--48 89 16             >--mov    QWORD PTR [rsi],rdx
22510    8bd85:>--c3                   >--ret
22511    8bd86:>--55                   >--push   rbp
22512    8bd87:>--48 89 e5             >--mov    rbp,rsp
22513    8bd8a:>--e8 b1 ff ff ff       >--call   8bd40 <dictSetVal.part.0>
22514    8bd8f:>--90                   >--nop

PingXie avatar Aug 20 '24 06:08 PingXie

I'm going to ask again if we can split formatting and functional changes. There are a lot changes in this PR that are non-functional refactoring (moving functions around, removing _, etc) and not functional. This makes it much harder to review, and I don't think we should be merging functional changes and refactoring. I'm 100% okay merging refactoring changes, but don't want them coupled. If you are already douching the code, the refactoring is probably fine.

I thought about that. I don't see this PR being needed for back-porting to 7.X. And if we are making refectoring changes, it will be good to keep 8.0 and unstable in sync so future backporting changes would be easy. So assuming we are going to get both changes (the macros and the code hygiene improvements) to 8.0, I don't see much difference between one and two PRs.

That said, I am fine with splitting the PR into two too. How about I keep the macros and related struct names (dictEntryNormal, etc) for the first one and the remaining goes to the second one?

PingXie avatar Aug 23 '24 23:08 PingXie

I thought about that. I don't see this PR being needed for back-porting to 7.X. And if we are making refectoring changes, it will be good to keep 8.0 and unstable in sync so future backporting changes would be easy. So assuming we are going to get both changes (the macros and the code hygiene improvements) to 8.0, I don't see much difference between one and two PRs.

Yeah. The main point is basically just difficulty in back porting which might require extra work to pull out the functional change as well as it's harder to understand the overall change when looking back because there are extra changes.

That said, I am fine with splitting the PR into two too. How about I keep the macros and related struct names (dictEntryNormal, etc) for the first one and the remaining goes to the second one?

Yeah, this is perfect.

madolson avatar Aug 24 '24 19:08 madolson

Thanks, @madolson!

PingXie avatar Sep 03 '24 01:09 PingXie