perf: reduce allocations handling terms
Reduce allocations in various methods working on ast.Term's:
- 1 alloc less needed in
Set.Copy(!) - 1 alloc less needed for
Ref.Ptr - 2 allocs less needed in
*object.MergeWith - Remove
.Copycall in all the various ref prefix functions
The latter is likely the only "controversial" change, but reading the docstrings for these functions made no mention of returing deep copies, and importantly, we had no code that depended on them doing so. Meaning we paid a cost without reaping benefits from that. If future callers want copies, they can copy.
Also added relevant micro-benchmarks.
Deploy Preview for openpolicyagent ready!
| Name | Link |
|---|---|
| Latest commit | 6c6792d68d8d616464f54a29d25f949bb514f96e |
| Latest deploy log | https://app.netlify.com/projects/openpolicyagent/deploys/69373f91cbeb440008124226 |
| Deploy Preview | https://deploy-preview-8116--openpolicyagent.netlify.app |
| Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify project configuration.
Deploy Preview for openpolicyagent ready!
| Name | Link |
|---|---|
| Latest commit | 9a869d85c90fa072939860d7e75e8b4e59840df2 |
| Latest deploy log | https://app.netlify.com/projects/openpolicyagent/deploys/6937400e3c27880008f5e0b3 |
| Deploy Preview | https://deploy-preview-8116--openpolicyagent.netlify.app |
| Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify project configuration.
Great suggestion!
However, the method func (ref Ref) Ptr() (string, error) could be slightly improved, as the current implementation has a couple of issues:
- Iterates twice: once for size calculation, once for writing
- l += len(str) doesn't account for escaping. url.PathEscape can expand strings (e.g., space โ %20, 1 char โ 3 chars)
- tail[i].Value.(String) checked in both loops
What I mean by point 2:
str := "hello world" // len = 11
// After PathEscape: "hello%20world" // len = 13
// Buffer will be undersized and need to grow!
We could implement this method with a single loop iteration and correct buffer length calculation:
func (ref Ref) Ptr() (string, error) {
tail := ref[1:]
if len(tail) == 0 {
return "", nil
}
// Estimate buffer size: original length + separators + ~30% for escaping
estSize := 0
for i := range tail {
if str, ok := tail[i].Value.(String); ok {
estSize += len(str)
} else {
return "", errors.New("invalid path value type")
}
}
estSize += len(tail) - 1 // separators ('/')
estSize += estSize / 3 // ~30% overhead for potential escaping
var buf strings.Builder
buf.Grow(estSize)
for i := range tail {
if i > 0 {
buf.WriteByte('/')
}
str := string(tail[i].Value.(String)) // safe: already validated above
buf.WriteString(url.PathEscape(str))
}
return buf.String(), nil
}
Or even integrate the escaping to avoid intermediate allocations from url.PathEscape:
const upperhex = "0123456789ABCDEF"
// shouldEscapePath checks if byte needs escaping in URL path component
// RFC 3986: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
func shouldEscapePath(c byte) bool {
if 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9' {
return false
}
switch c {
case '-', '_', '.', '~':
return false
}
return true
}
func (ref Ref) Ptr() (string, error) {
tail := ref[1:]
if len(tail) == 0 {
return "", nil
}
// Validate all terms and calculate size
estSize := len(tail) - 1 // separators
for i := range tail {
str, ok := tail[i].Value.(String)
if !ok {
return "", errors.New("invalid path value type")
}
estSize += len(str) + len(str)/3 // string + ~30% escape overhead
}
var buf strings.Builder
buf.Grow(estSize)
for i := range tail {
if i > 0 {
buf.WriteByte('/')
}
s := string(tail[i].Value.(String))
// Inline path escaping - avoids url.PathEscape allocations
for j := 0; j < len(s); j++ {
c := s[j]
if shouldEscapePath(c) {
buf.WriteByte('%')
buf.WriteByte(upperhex[c>>4])
buf.WriteByte(upperhex[c&15])
} else {
buf.WriteByte(c)
}
}
}
return buf.String(), nil
}
What do you think?
@alex60217101990 Hey! And thanks for your suggestion. I'm sorry though, but I'm not sure I understand your comment vs. your code. The code you posted also iterates twice over the parts, and I'm frankly not sure how any solution could correctly calculate the length to pre-allocate without it doing that.
As for the shouldEscapePath suggestion โ yes, that doesn't sound too bad to me, at least not if it's to be used in more locations than here. Ideally we'd have a TextAppender provided by the stdlib and just use that, but at least there's some insight into why there isn't one provided here:
https://cs.opensource.google/go/go/+/refs/tags/go1.25.5:src/net/url/url.go;l=1240
I think an appender for URL path components in some util lib of ours (which relied on the logic from your example) would likely be a good option.
@anderseknert Okay, I agree with you.
One small question, wouldn't it perhaps be better if the object were allocated on the stack rather than the heap ?
buf := &strings.Builder{} // forces heap allocation
Better:
var buf strings.Builder // stack allocation (if it doesn't escape)
Also, we could start the loop in the method (ref Ref) GroundPrefix() Ref from 1 immediately:
func (ref Ref) GroundPrefix() Ref {
if ref.IsGround() {
return ref
}
// Head is always ground by definition, start checking from index 1
for i := 1; i < len(ref); i++ {
if !ref[i].IsGround() {
return ref[:i]
}
}
return ref
}
In addition to your comment.
I looked at the code via the link you attached, thank you. Looking at Go's stdlib url.escape() implementation (source) To correctly pre-allocate the buffer, two passes are necessary. This is the same approach used by Go's standard library.
As suggested, creating a reusable PathEscapeAppender in a util package would be valuable if this pattern is used in multiple places.
Proposed utility for v1/util/escape.go:
package util
import "strings"
const upperhex = "0123456789ABCDEF"
// shouldEscapePathSegment checks if byte needs escaping in URL path segment
// Based on RFC 3986 and Go's url.shouldEscape(c, encodePathSegment)
func shouldEscapePathSegment(c byte) bool {
// Unreserved characters (RFC 3986 ยง2.3)
if 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9' {
return false
}
switch c {
case '-', '_', '.', '~':
return false
// Additional allowed in path segment
case '!', '$', '&', '\'', '(', ')', '*', '+', '=', ':', '@':
return false
}
return true
}
// AppendPathEscape appends the path-escaped form of s to buf.
// This avoids intermediate allocations compared to url.PathEscape.
func AppendPathEscape(buf *strings.Builder, s string) {
for i := 0; i < len(s); i++ {
c := s[i]
if shouldEscapePathSegment(c) {
buf.WriteByte('%')
buf.WriteByte(upperhex[c>>4])
buf.WriteByte(upperhex[c&15])
} else {
buf.WriteByte(c)
}
}
}
// CountPathEscapeLen returns the length of s after path escaping.
// Useful for pre-allocating buffers.
func CountPathEscapeLen(s string) int {
n := 0
for i := 0; i < len(s); i++ {
if shouldEscapePathSegment(s[i]) {
n += 3 // %XX
} else {
n++
}
}
return n
}
And then the Ref.Ptr() method:
func (ref Ref) Ptr() (string, error) {
tail := ref[1:]
if len(tail) == 0 {
return "", nil
}
// Calculate exact buffer size
l := len(tail) - 1 // separators
for i := range tail {
str, ok := tail[i].Value.(String)
if !ok {
return "", errors.New("invalid path value type")
}
l += util.CountPathEscapeLen(string(str))
}
var buf strings.Builder
buf.Grow(l)
for i := range tail {
if i > 0 {
buf.WriteByte('/')
}
util.AppendPathEscape(&buf, string(tail[i].Value.(String)))
}
return buf.String(), nil
}