Now, nanoseconds are properly encoded/decoded, and the lexicographical order of the bytes is correct when comparing two Duration objects: one with negative nanoseconds and another with zero nanoseconds.

Fixed: #19910

Description

Closes: #19910

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.

I have...

[x] included the correct type prefix in the PR title
[x] confirmed ! in the type prefix if API or client breaking change
[x] targeted the correct branch (see PR Targeting)
[x] provided a link to the relevant issue or specification
[x] reviewed "Files changed" and left comments if necessary
[x] included the necessary unit and integration tests
[x] added a changelog entry to CHANGELOG.md
[x] updated the relevant documentation or specification, including comments for documenting Go code
[x] confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.

I have...

[ ] confirmed the correct type prefix in the PR title
[ ] confirmed all author checklist items have been addressed
[ ] reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Summary by CodeRabbit

Bug Fixes
- Improved consistency in comparing duration objects by adjusting the encoding of zero, negative, and positive nanoseconds.

Mar 31 '24 12:03 EmilGeorgiev

Walkthrough

The recent changes in the orm module focus on resolving issues related to the encoding and decoding of nanoseconds in duration objects. By implementing a consistent approach of adding 999,999,999 nanoseconds during encoding, the update ensures accurate comparison of durations, maintains lexicographical order, and addresses discrepancies between direct comparisons and byte comparisons.

Changes

Files	Change Summary
`orm/CHANGELOG.md`, `orm/encoding/ormfield/duration.go`	Adjusted nanosecond encoding for consistent duration comparisons, ensuring non-negative values and lexicographical order.
`orm/encoding/ormfield/timestamp.go`	Updated encoding and decoding functions for timestamps to handle edge cases and maintain order in nanoseconds.

Assessment against linked issues

Objective (Issue #)	Addressed	Explanation
Correct encoding/decoding of nanoseconds in durations (#19910)	✅
Consistent comparison of durations with zero, negative, and positive nanoseconds (#19910)	✅
Updating test cases to validate the changes (#19910)	✅

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

Mar 31 '24 12:03 coderabbitai[bot]

@aaronc if my PR or description are not clear and you have any questions please ask me and I will try to explain better.

Apr 10 '24 15:04 EmilGeorgiev

Thanks @EmilGeorgiev. What would be helpful is if you could write out what the encoding format is now (this should be documented anyway) and also how that differs from what was there before, maybe with a couple of specific failure values. I know you've included some of this, but I spent some time looking at this yesterday and for some reason it's still a bit fuzzy.

Apr 10 '24 16:04 aaronc

First let's start from the beginning.

How I found the bug - I saw that we have failing test cases "TestCodec/dur_false" and "TestCodec/dur_false" and I started to investigate why they are failing.
What TestCodec contains several test cases which are testing whether encoding and decoding of different types ( uint32, uint64, string, int32, int64, bool, timestamp, duration) are working correctly and that the byte representation of every type is lexicographically correct compared to the lexicographically order of bytes of another slice of bytes of the sam type. For example when we compare the values or the byte representation of these values which has the some type 'A' < 'B', 10 < 20, Timestamp{second: 10, nanos: 10} < Timestamp{second: 40, nanos: 40}, and so on. If the first value is less then the second one then this means when we convert the values to bytes again the first byte slice will be lexhicorfically less then the second one
Why the test cases for Duration failed? - The problem is that there are some edge cases in which durationA < durationB but when we encode the duration values in bytes then bytesOfDurationA > bytesOfDurationB. For example Duration{Seconds: -5, nanos: -1} < Duration{Seconds: -5, nanos: -1}, but when we encode these to values the bytes of the first one will be greater than hthe bytes of the second one which is not correct. The problem is that in the current implementation when nanos are negative number we set this value for the nanos: nanosInt = DurationNanosMax + nanosInt + 1 ( which I think is very cleaver and interesting decision to solve the problem when the naps are negative ), if the nanosInt is -1 the value for the nanos after this will be nanosInt = 999999999 +(-1) +1 = 999999999. In the same time all nanos that are equal to zero are untouched/unchanged. After that we have nanos value of -1 for the first duration and 0 nanos for the second duration at the end we will 999999999 for the first duration and 0 for the second. 999999999 > 0 and this will lead to different result when we compare two duration types with DurationCodec.Comapre and bytes.Compare(bytes(dur1)m, bytes(dur2)).
How we can solve the problem - one possible solution is treat/manipulate nanos in same way independent whether they are negative, zero or positive. I always add 999999999 to the nanos and we guarantee that after that the nanos will be always >= 0. For example we have -1 nanos in the first duration and 0 nanos in the second duration ( -1 < 0 ) when we add 999999999 the values will be 999999998 nanos for the first duration and 999999999 nanos for the second duration. We keep the lexicographically order of the byte representation of these values ( 999999998 < 999999999 ) .
How the encoding format looks like in the PR and what is the difference with the current encoding format: First the encoding for the seconds is untouched, only encoding for the nanos is changed. In the current implementation we have one encoding format when nanos are 0 and another when they are > 0: When nanos are 0 then the encoding format is simple, we represent this with one byte : . When nanos are 0 In the PR we add 999999999 to the nanos and the encoding format for the nanos are represented with 4 bytes [0x0] - current encoding format when nanos are zero [0xBB, 0x9A, 0xC9, 0xFF] - encoded format in the PR when nanos are zero

When nanos > 0, for example 1: [0xC0, 0x0, 0x0, 0x1] - the current encoding format [0xBB, 0x9A, 0xCA, 0x00] - the encoding format in the PR. Here, when nanos are equal to 1 In the PR we add 999999999 to the nanos and the value become 1000000000.

Something that we should mention is that in the current implementation we make OR operation to the first byte with 0xC0 to be sure that the first 2 digits of the first byte will be always 11 (11xxxxxx) . In the PR changed 0xC0 with 0x80 and now we are sure that only the first digit of the byte is 1 (1xxxxxxx). I did this because if the nanos are 999999999 we will add to them 999999999 in the Encode method. The new number will be 1999999998 with in binary representation is ( 01110111001101011001001111111110 ) . We need at least 31 digits to represent this number. In contrast in the current implementation we need only 30 digits because the number 999999999 in binary is ( 00111011100110101100100111111111 ) the change from 0xC0 to 0x80 was need to be sure that when we encode/decode the maximum possible number 1999999998, we will get always the same result and we still keep the logic that check whether the bytes contains zero or not which is necessary because encoding and decoding function are used and for timestamp where when nanos are zero the byte representation will be still [0x0].

Conclusion This change simplifies the encoding logic and ensures that the lexicographical order of durations is preserved in their byte representations, addressing the edge cases effectively.

I hope that explain it better now. Sorry for the confusion in the beginning. cc @aaronc

Apr 11 '24 06:04 EmilGeorgiev

ACK. Makes sense now. Thanks @EmilGeorgiev and apologies for my delay in doing a complete review. This is simpler than the previous implementation and doesn't seem to have any downside in terms of the number of bytes encoded.

The biggest concern with deploying this if someone is using the previous version of this encoding and needs to upgrade. I'm not sure if anyone is, but it's possible. Can you make it clear in the CHANGELOG that there's a breaking change please and that users will need to take care when upgrading and possibly perform a migration?

Thanks @aaronc . Your implementation was very good. I just get your idea and improving it a little bit.

I will mark this as breaking change

May 13 '24 13:05 EmilGeorgiev

I think the one thing we might need here is a migration path for users of the previous encoding version. @EmilGeorgiev are you using the previous version in production?

Even if we don't know of production users though, we should probably lay the grounds for a migration path as a precaution. The first thing that involves is making sure the previous duration and timestamp codecs are available as DurationV1 and TimestampV1. We are using DurationV0 and TimestampV0 in production and will want to upgrade directly to the versions in this PR (which we should consider V2), so I intend to work on some migration code soon either way.

@EmilGeorgiev would it be easy for you to add DurationV1 and TimestampV1 to the PR with the previous encoding? Otherwise I can do that in a separate PR and then tackle the migration.

May 21 '24 20:05 aaronc

I think the one thing we might need here is a migration path for users of the previous encoding version. @EmilGeorgiev are you using the previous version in production?

Even if we don't know of production users though, we should probably lay the grounds for a migration path as a precaution. The first thing that involves is making sure the previous duration and timestamp codecs are available as DurationV1 and TimestampV1. We are using DurationV0 and TimestampV0 in production and will want to upgrade directly to the versions in this PR (which we should consider V2), so I intend to work on some migration code soon either way.

@EmilGeorgiev would it be easy for you to add DurationV1 and TimestampV1 to the PR with the previous encoding? Otherwise I can do that in a separate PR and then tackle the migration.

I don't use it in production.

I will add DurationV1 and TimestampV1 in this PR with the previous encoding

cc @aaronc

May 22 '24 14:05 EmilGeorgiev

@aaronc DurationV1Codec and TimestampV1Codec with the previous encoded/decoded version are added in this branch. Do you need any other help?

May 22 '24 15:05 EmilGeorgiev

cosmos-sdk
cosmos-sdk copied to clipboard

fix(orm)!: properly encode/decode nanoseconds in Duration type

Description

Author Checklist

Reviewers Checklist

Summary by CodeRabbit

Walkthrough

Changes

Assessment against linked issues

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (`.coderabbit.yaml`)

Documentation and Community

cosmos-sdk cosmos-sdk copied to clipboard

fix(orm)!: properly encode/decode nanoseconds in Duration type

Description

Author Checklist

Reviewers Checklist

Summary by CodeRabbit

Walkthrough

Changes

Assessment against linked issues

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (.coderabbit.yaml)

Documentation and Community

cosmos-sdk
cosmos-sdk copied to clipboard

CodeRabbit Configration File (`.coderabbit.yaml`)