odata.net icon indicating copy to clipboard operation
odata.net copied to clipboard

Create optimized ODataPath cloning constructor

Open habbes opened this issue 3 years ago • 1 comments

Issues

*This pull request fixes #2488

Description

Briefly describe the changes of this pull request.

Checklist (Uncheck if it is not completed)

  • [ ] Test cases added
  • [ ] Build and test with one-click build and test script passed

Additional work necessary

If documentation update is needed, please add "Docs Needed" label to the issue and provide details about the required document change in the issue.

habbes avatar Aug 19 '22 09:08 habbes

This PR has 10 quantified lines of changes. In general, a change size of upto 200 lines is ideal for the best PR experience!


Quantification details

Label      : Extra Small
Size       : +10 -0
Percentile : 4%

Total files changed: 2

Change summary by file extension:
.cs : +10 -0

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.

Why proper sizing of changes matters

Optimal pull request sizes drive a better predictable PR flow as they strike a balance between between PR complexity and PR review overhead. PRs within the optimal size (typical small, or medium sized PRs) mean:

  • Fast and predictable releases to production:
    • Optimal size changes are more likely to be reviewed faster with fewer iterations.
    • Similarity in low PR complexity drives similar review times.
  • Review quality is likely higher as complexity is lower:
    • Bugs are more likely to be detected.
    • Code inconsistencies are more likely to be detected.
  • Knowledge sharing is improved within the participants:
    • Small portions can be assimilated better.
  • Better engineering practices are exercised:
    • Solving big problems by dividing them in well contained, smaller problems.
    • Exercising separation of concerns within the code changes.

What can I do to optimize my changes

  • Use the PullRequestQuantifier to quantify your PR accurately
    • Create a context profile for your repo using the context generator
    • Exclude files that are not necessary to be reviewed or do not increase the review complexity. Example: Autogenerated code, docs, project IDE setting files, binaries, etc. Check out the Excluded section from your prquantifier.yaml context profile.
    • Understand your typical change complexity, drive towards the desired complexity by adjusting the label mapping in your prquantifier.yaml context profile.
    • Only use the labels that matter to you, see context specification to customize your prquantifier.yaml context profile.
  • Change your engineering behaviors
    • For PRs that fall outside of the desired spectrum, review the details and check if:
      • Your PR could be split in smaller, self-contained PRs instead
      • Your PR only solves one particular issue. (For example, don't refactor and code new features in the same PR).

How to interpret the change counts in git diff output

  • One line was added: +1 -0
  • One line was deleted: +0 -1
  • One line was modified: +1 -1 (git diff doesn't know about modified, it will interpret that line like one addition plus one deletion)
  • Change percentiles: Change characteristics (addition, deletion, modification) of this PR in relation to all other PRs within the repository.


Was this comment helpful? :thumbsup:  :ok_hand:  :thumbsdown: (Email) Customize PullRequestQuantifier for this repository.

Not sure I understand what you mean by this statement:

If I make it public, a different method overload will get called without the user having to opt-in to it or without the user having to change arguments.

Are you saying that if you make this constructor public that will result into existing user code automatically invoking it instead of the constructor that is being invoked presently?

gathogojr avatar Aug 25 '22 05:08 gathogojr

@habbes The following statement was confusing to me at first:

  1. Create a new List with the same capacity as the length of the existing list

"same capacity" is what threw me off (almost). I think you should say "Create a new List with the capacity equal to the length of the existing list"

gathogojr avatar Aug 25 '22 05:08 gathogojr

@habbes Under scenario 4, what does odataPath.Segments.Length represent on line 1 of the code snippet:

int capacity = odataPath.Segments.Length < odataPath.segments.Count ? odataPath.segments.Capacity : odataPathSegments.Capacity + 1;

Did you intend:

int capacity = odataPath.segments.Count < odataPath.segments.Capacity ? odataPath.segments.Capacity : odataPathSegments.Capacity + 1;

You also say that you based this PR on scenario 4 but the code in this PR is different

gathogojr avatar Aug 25 '22 05:08 gathogojr

@habbes Curious whether there are places in the existing code that could make use of this new constructor.

gathogojr avatar Aug 25 '22 05:08 gathogojr

@habbes I'm not sure how much we do equality checks for ODataPath but maybe we could also optimize by using a for loop in place of Where and Any [here]that (https://github.com/OData/odata.net/blob/4e263ca4b94d01f2d105d8282c62ec272cfcd1bb/src/Microsoft.OData.Core/UriParser/SemanticAst/ODataPath.cs#L152)

gathogojr avatar Aug 25 '22 06:08 gathogojr

Not sure I understand what you mean by this statement:

If I make it public, a different method overload will get called without the user having to opt-in to it or without the user having to change arguments.

Are you saying that if you make this constructor public that will result into existing user code automatically invoking it instead of the constructor that is being invoked presently?

Yes, that is precisely what I mean.

Currently we have a constructor with the existing signature:

public ODataPath(IEnumerable<ODataPathSegment>);

ODataPath also implements IEnumerable<ODataPathSegment>.

So we have code internally that create a copy of the ODataPath using the following pattern:

ODataPath oldPath;
var newPath = new ODataPath(oldPath);

Look at the AddSegment extension method for example. Currently this calls the constructor above.

This PR adds a new a constructor with the signature:

internal ODataPath(ODataPath);

Now the AddSegment method's new ODataPath(odataPath) will go to this new constructor instead of the one that takes an IEnumerable<ODataPath>. But External code will still go to the old IEnumerable<ODataPath> constructor because the new one is internal. But if I made it public, then even external code with the same pattern will start going to the new constructor.

Note that this change won't affect calls that pass a list of segments directly, or any other enumerable for the matter. It only "hijacks" interal calls that pass an ODataPath directly.

Let me know if that clears things up.

habbes avatar Aug 29 '22 04:08 habbes

@habbes Under scenario 4, what does odataPath.Segments.Length represent on line 1 of the code snippet:

int capacity = odataPath.Segments.Length < odataPath.segments.Count ? odataPath.segments.Capacity : odataPathSegments.Capacity + 1;

Did you intend:

int capacity = odataPath.segments.Count < odataPath.segments.Capacity ? odataPath.segments.Capacity : odataPathSegments.Capacity + 1;

You also say that you based this PR on scenario 4 but the code in this PR is different

Yes, you're right, that is what I meant:

int capacity = odataPath.segments.Count < odataPath.segments.Capacity ? odataPath.segments.Capacity : odataPathSegments.Capacity + 1;

Also you're right that this slightly differs from the actual code in the PR. I must have changed after writing the description. The initial thought was to set the same capacity as the input list if was already greater than the length. But I didn't notice any improvements in the benchmarks from doing that. The thought was that doing that may reduce resizing. But that benefit didn't materialize because only one segment is going to be added to this newly created list. The next segment after that is going to be added to a whole new ODataPath instance with a whole new list that's a whole new copy. So it didn't make sense to optimize beyond the immediate next segment to be added, otherwise we may just end up with a bunch of arrays that are larger than necessary. Let me update the description to match the code.

habbes avatar Aug 29 '22 04:08 habbes

@habbes Curious whether there are places in the existing code that could make use of this new constructor.

Yes, I have mentioned an example in a comment above: https://github.com/OData/odata.net/pull/2489#issuecomment-1229753241

habbes avatar Aug 29 '22 04:08 habbes

@habbes I'm not sure how much we do equality checks for ODataPath but maybe we could also optimize by using a for loop in place of Where and Any [here]that (

https://github.com/OData/odata.net/blob/4e263ca4b94d01f2d105d8282c62ec272cfcd1bb/src/Microsoft.OData.Core/UriParser/SemanticAst/ODataPath.cs#L152

)

Yeah, we could make that improvement. But I have seen a couple of other places that could benefit from using plain loops to iterate over the ODataPath. I think I could bundle them up in a separate PR, what do you think?

habbes avatar Aug 29 '22 04:08 habbes