smithy
smithy copied to clipboard
Unused offset within smithy.model.pattern.SmithyPattern.Segment.Parse
Hi, I think there's a small bug with the Segment Parser.
It appears that the offset
variable is currently unused within the smithy.model.pattern.SmithyPattern.Segment.Parse
function.
https://github.com/awslabs/smithy/blob/2e1cdc4af7e7cdab3138f5f3912a3af930b569a2/smithy-model/src/main/java/software/amazon/smithy/model/pattern/SmithyPattern.java#L276
All callers of this function also appear to be calling this parse
function incorrectly and are exhibiting correct parsing behaviour because the offset
variable is unused.
The call to Segment.parse
at https://github.com/awslabs/smithy/blob/2e1cdc4af7e7cdab3138f5f3912a3af930b569a2/smithy-model/src/main/java/software/amazon/smithy/model/pattern/UriPattern.java#L80
Should always have an offset of 0 because the string is first split into chunks by /
.
The call to Segment.parse
at https://github.com/awslabs/smithy/blob/2e1cdc4af7e7cdab3138f5f3912a3af930b569a2/smithy-model/src/test/java/software/amazon/smithy/model/pattern/SmithyPatternTest.java#L39
Same as above, should always be 0 given that the string is chunked by '/' first.
The call to Segment.parse
at https://github.com/awslabs/smithy/blob/2e1cdc4af7e7cdab3138f5f3912a3af930b569a2/smithy-model/src/main/java/software/amazon/smithy/model/traits/EndpointTrait.java#L66
This should also have the offset always set to 0 because the string is first tokenized into chunks.
A new test should be added to verify that the offset is used correctly.
Alternatively, the offset
parameter could be removed but this would be a breaking API change so less desirable, up to you though.
I'm happy to submit a PR to correct the issue if the above is correct.
You're right that the offset
parameter to SmithyPattern.Segment.parse
isn't used. I think the documentation for the parameter is a little misleading, causing your interpretation that it should always be given 0. The documentation currently reads "Character offset where the segment starts." This should actually be "Character offset where the segment starts in the containing pattern."
With that in mind, the initial offset
values make more sense but the calculations are still incorrect - they don't take in to account the separator characters when adding the length to the offset. See here:
https://github.com/awslabs/smithy/blob/2e1cdc4af7e7cdab3138f5f3912a3af930b569a2/smithy-model/src/main/java/software/amazon/smithy/model/pattern/UriPattern.java#L76-L82
and here:
https://github.com/awslabs/smithy/blob/2e1cdc4af7e7cdab3138f5f3912a3af930b569a2/smithy-model/src/main/java/software/amazon/smithy/model/traits/EndpointTrait.java#L67
Instead of deprecating the parse
method with offset, we could make use of it by safely overloading the Segment
constructor and utilizing it in the exception messages thrown from checkForInvalidContents
:
https://github.com/awslabs/smithy/blob/2e1cdc4af7e7cdab3138f5f3912a3af930b569a2/smithy-model/src/main/java/software/amazon/smithy/model/pattern/SmithyPattern.java#L252
We'll discuss this and make the appropriate update(s).