ygot icon indicating copy to clipboard operation
ygot copied to clipboard

perf(node): caching implementation for node operations

Open jayzhudev opened this issue 1 year ago • 4 comments

What

First of all, thank you for maintaining and sharing such an amazing and useful library, I hope we can give back to the community while making use of this project. This squashed commit was ported from our local fork after we made performance improvement changes for our use case. And my apologies if the size of the combined change is large for review. Please suggest any changes that could improve this patch, and ask for any clarifications.

Why

In one of our use cases of the ygot library, we found that there are significant performance benefits if we add a caching layer to the node calls. Most of the time spent on calling these methods, especially when the config tree is very large, is to perform tree traversal and find the nodes to operate on. This involves using relatively expensive calls to reflect methods, string comparisons and manipulations, and is mostly repetitive computation, if the configuration tree's structure and paths are relatively static but the values may change.

How (high-level)

The idea to implement the caching layer is to be able to use the unique node paths to locate a node immediately once its address is cached. This provides a shortcut to the tree-traversal method and brings in performance gains. In our use case, this feature addition along with the changes we made in other perf PRs brought the computation time down to <= 10ms from ~2000ms, under our benchmark tests with large config trees and high node modification stress.

There could be many other alternative implementations to achieve this performance gain, this is what we chose to implement and it worked well for our use case.

This optimization does come with a cost of complexity though, where the cache operations have to be managed properly. It also resulted in changes we had to make in reflect.go to make sure that the node's address doesn't change when its value is modified.

In addition, we hope to make sure that the ygot library is stateless, and there's no breaking change to the exported APIs after adding the caching mechanism. As a result, the applications that use the ytypes library maintain the cache(s) on their own, and they can optionally pass in a pointer of the node cache to the call option struct and get a performance boost.

Other Changes

JSON Marshaling Performance

When trying to squeeze the last bit of performance out, we decided to use go-json at a few places instead of encoding/json. There was a bit of hesitation in doing so because of adding a 3rd party dependency. But the performance boost was significant, especially on the hot paths.

Procedure Shortcuts and reflect Calls

At the very last bit, but not least, we try to optimize the use of reflect as much as possible, and early return from continuing unnecessary heap allocations (GC pressure at scale) and computations. Both of which could be considered refactoring with performance gains so no breaking change was introduced. Such code changes can be found, for example, in ygot/render.go/prependmodsJSON.

Summary of Change

  • Add caching for high performance Sets/Gets. The CPU usage and performance of GetNode/SetNode calls are significantly reduced after this change.

  • Improve JSON marshaling performance using go-json. This helps squeeze out the last bit of performance particularly for embedded devices.

  • Shortcut expensive heap allocations. The repetitive heap allocations may increase GC pressure and increase CPU usage when being called frequently.

Note: tests need to be added and completed for caching enabled method calls.

Performance

Here are some rough reference numbers from e2e testing sending 2000 leaf-list updates in a SetRequest:

Version Time
Before #830 ~2,300 ms
After #830 ~9 ms (warm cache)

Limitations

Currently, inserting a new element to a list uses reflect.Append in util/reflect.go. This changes the memory address of the slice head which results in node cache synchronization issues. Because of this, list and leaf-list types are not handled by the node cache for now.

jayzhudev avatar May 01 '23 23:05 jayzhudev

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

google-cla[bot] avatar May 01 '23 23:05 google-cla[bot]

Coverage Status

Coverage: 89.905% (-0.2%) from 90.122% when pulling 212fcc8cc51a5ea870850ea079c96b82dd9c050d on jayzhudev:perf-caching into 97fb2306cdc2a17e4894d13ee1390be40007e996 on openconfig:master.

coveralls avatar May 02 '23 21:05 coveralls

By the way thanks for posting this contribution! This looks like something that can benefit others and overall this implementation looks good.

wenovus avatar May 03 '23 00:05 wenovus

By the way thanks for posting this contribution! This looks like something that can benefit others and overall this implementation looks good.

Thanks for supporting this change! Let's get the other 3 PRs completed and I'll use some after work time to clean up this PR and add better tests 😄

jayzhudev avatar May 04 '23 03:05 jayzhudev