cosmos-sdk
cosmos-sdk copied to clipboard
fix(orm)!: properly encode/decode nanoseconds in Duration type
Now, nanoseconds are properly encoded/decoded, and the lexicographical order of the bytes is correct when comparing two Duration objects: one with negative nanoseconds and another with zero nanoseconds.
Fixed: #19910
Description
Closes: #19910
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.
I have...
- [x] included the correct type prefix in the PR title
- [x] confirmed
!in the type prefix if API or client breaking change - [x] targeted the correct branch (see PR Targeting)
- [x] provided a link to the relevant issue or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] included the necessary unit and integration tests
- [x] added a changelog entry to
CHANGELOG.md - [x] updated the relevant documentation or specification, including comments for documenting Go code
- [x] confirmed all CI checks have passed
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.
I have...
- [ ] confirmed the correct type prefix in the PR title
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage
Summary by CodeRabbit
- Bug Fixes
- Improved consistency in comparing duration objects by adjusting the encoding of zero, negative, and positive nanoseconds.
Walkthrough
The recent changes in the orm module focus on resolving issues related to the encoding and decoding of nanoseconds in duration objects. By implementing a consistent approach of adding 999,999,999 nanoseconds during encoding, the update ensures accurate comparison of durations, maintains lexicographical order, and addresses discrepancies between direct comparisons and byte comparisons.
Changes
| Files | Change Summary |
|---|---|
orm/CHANGELOG.md, orm/encoding/ormfield/duration.go |
Adjusted nanosecond encoding for consistent duration comparisons, ensuring non-negative values and lexicographical order. |
orm/encoding/ormfield/timestamp.go |
Updated encoding and decoding functions for timestamps to handle edge cases and maintain order in nanoseconds. |
Assessment against linked issues
| Objective (Issue #) | Addressed | Explanation |
|---|---|---|
| Correct encoding/decoding of nanoseconds in durations (#19910) | ✅ | |
| Consistent comparison of durations with zero, negative, and positive nanoseconds (#19910) | ✅ | |
| Updating test cases to validate the changes (#19910) | ✅ |
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?
Tips
Chat
There are 3 ways to chat with CodeRabbit:
- Review comments: Directly reply to a review comment made by CodeRabbit. Example:
I pushed a fix in commit <commit_id>.Generate unit testing code for this file.Open a follow-up GitHub issue for this discussion.
- Files and specific lines of code (under the "Files changed" tab): Tag
@coderabbitaiin a new review comment at the desired location with your query. Examples:@coderabbitai generate unit testing code for this file.@coderabbitai modularize this function.
- PR comments: Tag
@coderabbitaiin a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:@coderabbitai generate interesting stats about this repository and render them as a table.@coderabbitai show all the console.log statements in this repository.@coderabbitai read src/utils.ts and generate unit testing code.@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.@coderabbitai help me debug CodeRabbit configuration file.
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.
CodeRabbit Commands (invoked as PR comments)
@coderabbitai pauseto pause the reviews on a PR.@coderabbitai resumeto resume the paused reviews.@coderabbitai reviewto trigger an incremental review. This is useful when automatic reviews are disabled for the repository.@coderabbitai full reviewto do a full review from scratch and review all the files again.@coderabbitai summaryto regenerate the summary of the PR.@coderabbitai resolveresolve all the CodeRabbit review comments.@coderabbitai configurationto show the current CodeRabbit configuration for the repository.@coderabbitai helpto get help.
Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
CodeRabbit Configration File (.coderabbit.yaml)
- You can programmatically configure CodeRabbit by adding a
.coderabbit.yamlfile to the root of your repository. - Please see the configuration documentation for more information.
- If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation:
# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
Documentation and Community
- Visit our Documentation for detailed information on how to use CodeRabbit.
- Join our Discord Community to get help, request features, and share feedback.
- Follow us on X/Twitter for updates and announcements.
@aaronc if my PR or description are not clear and you have any questions please ask me and I will try to explain better.
Thanks @EmilGeorgiev. What would be helpful is if you could write out what the encoding format is now (this should be documented anyway) and also how that differs from what was there before, maybe with a couple of specific failure values. I know you've included some of this, but I spent some time looking at this yesterday and for some reason it's still a bit fuzzy.
First let's start from the beginning.
-
How I found the bug - I saw that we have failing test cases "TestCodec/dur_false" and "TestCodec/dur_false" and I started to investigate why they are failing.
-
What TestCodec contains several test cases which are testing whether encoding and decoding of different types ( uint32, uint64, string, int32, int64, bool, timestamp, duration) are working correctly and that the byte representation of every type is lexicographically correct compared to the lexicographically order of bytes of another slice of bytes of the sam type. For example when we compare the values or the byte representation of these values which has the some type 'A' < 'B', 10 < 20, Timestamp{second: 10, nanos: 10} < Timestamp{second: 40, nanos: 40}, and so on. If the first value is less then the second one then this means when we convert the values to bytes again the first byte slice will be lexhicorfically less then the second one
-
Why the test cases for Duration failed? - The problem is that there are some edge cases in which durationA < durationB but when we encode the duration values in bytes then bytesOfDurationA > bytesOfDurationB. For example Duration{Seconds: -5, nanos: -1} < Duration{Seconds: -5, nanos: -1}, but when we encode these to values the bytes of the first one will be greater than hthe bytes of the second one which is not correct. The problem is that in the current implementation when nanos are negative number we set this value for the nanos: nanosInt = DurationNanosMax + nanosInt + 1 ( which I think is very cleaver and interesting decision to solve the problem when the naps are negative ), if the nanosInt is -1 the value for the nanos after this will be nanosInt = 999999999 +(-1) +1 = 999999999. In the same time all nanos that are equal to zero are untouched/unchanged. After that we have nanos value of -1 for the first duration and 0 nanos for the second duration at the end we will 999999999 for the first duration and 0 for the second. 999999999 > 0 and this will lead to different result when we compare two duration types with DurationCodec.Comapre and bytes.Compare(bytes(dur1)m, bytes(dur2)).
-
How we can solve the problem - one possible solution is treat/manipulate nanos in same way independent whether they are negative, zero or positive. I always add 999999999 to the nanos and we guarantee that after that the nanos will be always >= 0. For example we have -1 nanos in the first duration and 0 nanos in the second duration ( -1 < 0 ) when we add 999999999 the values will be 999999998 nanos for the first duration and 999999999 nanos for the second duration. We keep the lexicographically order of the byte representation of these values ( 999999998 < 999999999 ) .
-
How the encoding format looks like in the PR and what is the difference with the current encoding format: First the encoding for the seconds is untouched, only encoding for the nanos is changed. In the current implementation we have one encoding format when nanos are 0 and another when they are > 0: When nanos are 0 then the encoding format is simple, we represent this with one byte : . When nanos are 0 In the PR we add 999999999 to the nanos and the encoding format for the nanos are represented with 4 bytes [0x0] - current encoding format when nanos are zero [0xBB, 0x9A, 0xC9, 0xFF] - encoded format in the PR when nanos are zero
When nanos > 0, for example 1: [0xC0, 0x0, 0x0, 0x1] - the current encoding format [0xBB, 0x9A, 0xCA, 0x00] - the encoding format in the PR. Here, when nanos are equal to 1 In the PR we add 999999999 to the nanos and the value become 1000000000.
Something that we should mention is that in the current implementation we make OR operation to the first byte with 0xC0 to be sure that the first 2 digits of the first byte will be always 11 (11xxxxxx) . In the PR changed 0xC0 with 0x80 and now we are sure that only the first digit of the byte is 1 (1xxxxxxx). I did this because if the nanos are 999999999 we will add to them 999999999 in the Encode method. The new number will be 1999999998 with in binary representation is ( 01110111001101011001001111111110 ) . We need at least 31 digits to represent this number. In contrast in the current implementation we need only 30 digits because the number 999999999 in binary is ( 00111011100110101100100111111111 ) the change from 0xC0 to 0x80 was need to be sure that when we encode/decode the maximum possible number 1999999998, we will get always the same result and we still keep the logic that check whether the bytes contains zero or not which is necessary because encoding and decoding function are used and for timestamp where when nanos are zero the byte representation will be still [0x0].
Conclusion This change simplifies the encoding logic and ensures that the lexicographical order of durations is preserved in their byte representations, addressing the edge cases effectively.
I hope that explain it better now. Sorry for the confusion in the beginning. cc @aaronc
ACK. Makes sense now. Thanks @EmilGeorgiev and apologies for my delay in doing a complete review. This is simpler than the previous implementation and doesn't seem to have any downside in terms of the number of bytes encoded.
The biggest concern with deploying this if someone is using the previous version of this encoding and needs to upgrade. I'm not sure if anyone is, but it's possible. Can you make it clear in the CHANGELOG that there's a breaking change please and that users will need to take care when upgrading and possibly perform a migration?
Thanks @aaronc . Your implementation was very good. I just get your idea and improving it a little bit.
I will mark this as breaking change
I think the one thing we might need here is a migration path for users of the previous encoding version. @EmilGeorgiev are you using the previous version in production?
Even if we don't know of production users though, we should probably lay the grounds for a migration path as a precaution. The first thing that involves is making sure the previous duration and timestamp codecs are available as DurationV1 and TimestampV1. We are using DurationV0 and TimestampV0 in production and will want to upgrade directly to the versions in this PR (which we should consider V2), so I intend to work on some migration code soon either way.
@EmilGeorgiev would it be easy for you to add DurationV1 and TimestampV1 to the PR with the previous encoding? Otherwise I can do that in a separate PR and then tackle the migration.
I think the one thing we might need here is a migration path for users of the previous encoding version. @EmilGeorgiev are you using the previous version in production?
Even if we don't know of production users though, we should probably lay the grounds for a migration path as a precaution. The first thing that involves is making sure the previous duration and timestamp codecs are available as
DurationV1andTimestampV1. We are usingDurationV0andTimestampV0in production and will want to upgrade directly to the versions in this PR (which we should consider V2), so I intend to work on some migration code soon either way.@EmilGeorgiev would it be easy for you to add
DurationV1andTimestampV1to the PR with the previous encoding? Otherwise I can do that in a separate PR and then tackle the migration.
I don't use it in production.
I will add DurationV1 and TimestampV1 in this PR with the previous encoding
cc @aaronc
@aaronc DurationV1Codec and TimestampV1Codec with the previous encoded/decoded version are added in this branch. Do you need any other help?