opa icon indicating copy to clipboard operation
opa copied to clipboard

perf: reduce allocations handling terms

Open anderseknert opened this issue 1 month ago โ€ข 6 comments

Reduce allocations in various methods working on ast.Term's:

  • 1 alloc less needed in Set.Copy(!)
  • 1 alloc less needed for Ref.Ptr
  • 2 allocs less needed in *object.MergeWith
  • Remove .Copy call in all the various ref prefix functions

The latter is likely the only "controversial" change, but reading the docstrings for these functions made no mention of returing deep copies, and importantly, we had no code that depended on them doing so. Meaning we paid a cost without reaping benefits from that. If future callers want copies, they can copy.

Also added relevant micro-benchmarks.

anderseknert avatar Dec 08 '25 21:12 anderseknert

Deploy Preview for openpolicyagent ready!

Name Link
Latest commit 6c6792d68d8d616464f54a29d25f949bb514f96e
Latest deploy log https://app.netlify.com/projects/openpolicyagent/deploys/69373f91cbeb440008124226
Deploy Preview https://deploy-preview-8116--openpolicyagent.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

netlify[bot] avatar Dec 08 '25 21:12 netlify[bot]

Deploy Preview for openpolicyagent ready!

Name Link
Latest commit 9a869d85c90fa072939860d7e75e8b4e59840df2
Latest deploy log https://app.netlify.com/projects/openpolicyagent/deploys/6937400e3c27880008f5e0b3
Deploy Preview https://deploy-preview-8116--openpolicyagent.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

netlify[bot] avatar Dec 08 '25 21:12 netlify[bot]

Great suggestion!

However, the method func (ref Ref) Ptr() (string, error) could be slightly improved, as the current implementation has a couple of issues:

  • Iterates twice: once for size calculation, once for writing
  • l += len(str) doesn't account for escaping. url.PathEscape can expand strings (e.g., space โ†’ %20, 1 char โ†’ 3 chars)
  • tail[i].Value.(String) checked in both loops

What I mean by point 2:

str := "hello world"  // len = 11
// After PathEscape: "hello%20world" // len = 13
// Buffer will be undersized and need to grow!

We could implement this method with a single loop iteration and correct buffer length calculation:

func (ref Ref) Ptr() (string, error) {
    tail := ref[1:]
    if len(tail) == 0 {
        return "", nil
    }

    // Estimate buffer size: original length + separators + ~30% for escaping
    estSize := 0
    for i := range tail {
        if str, ok := tail[i].Value.(String); ok {
            estSize += len(str)
        } else {
            return "", errors.New("invalid path value type")
        }
    }
    estSize += len(tail) - 1  // separators ('/')
    estSize += estSize / 3    // ~30% overhead for potential escaping

    var buf strings.Builder
    buf.Grow(estSize)

    for i := range tail {
        if i > 0 {
            buf.WriteByte('/')
        }
        str := string(tail[i].Value.(String)) // safe: already validated above
        buf.WriteString(url.PathEscape(str))
    }
    return buf.String(), nil
}

Or even integrate the escaping to avoid intermediate allocations from url.PathEscape:

const upperhex = "0123456789ABCDEF"

// shouldEscapePath checks if byte needs escaping in URL path component
// RFC 3986: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
func shouldEscapePath(c byte) bool {
    if 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9' {
        return false
    }
    switch c {
    case '-', '_', '.', '~':
        return false
    }
    return true
}

func (ref Ref) Ptr() (string, error) {
    tail := ref[1:]
    if len(tail) == 0 {
        return "", nil
    }

    // Validate all terms and calculate size
    estSize := len(tail) - 1 // separators
    for i := range tail {
        str, ok := tail[i].Value.(String)
        if !ok {
            return "", errors.New("invalid path value type")
        }
        estSize += len(str) + len(str)/3 // string + ~30% escape overhead
    }

    var buf strings.Builder
    buf.Grow(estSize)

    for i := range tail {
        if i > 0 {
            buf.WriteByte('/')
        }
        s := string(tail[i].Value.(String))
        
        // Inline path escaping - avoids url.PathEscape allocations
        for j := 0; j < len(s); j++ {
            c := s[j]
            if shouldEscapePath(c) {
                buf.WriteByte('%')
                buf.WriteByte(upperhex[c>>4])
                buf.WriteByte(upperhex[c&15])
            } else {
                buf.WriteByte(c)
            }
        }
    }
    return buf.String(), nil
}

What do you think?

alex60217101990 avatar Dec 09 '25 06:12 alex60217101990

@alex60217101990 Hey! And thanks for your suggestion. I'm sorry though, but I'm not sure I understand your comment vs. your code. The code you posted also iterates twice over the parts, and I'm frankly not sure how any solution could correctly calculate the length to pre-allocate without it doing that.

As for the shouldEscapePath suggestion โ€” yes, that doesn't sound too bad to me, at least not if it's to be used in more locations than here. Ideally we'd have a TextAppender provided by the stdlib and just use that, but at least there's some insight into why there isn't one provided here: https://cs.opensource.google/go/go/+/refs/tags/go1.25.5:src/net/url/url.go;l=1240

I think an appender for URL path components in some util lib of ours (which relied on the logic from your example) would likely be a good option.

anderseknert avatar Dec 09 '25 11:12 anderseknert

@anderseknert Okay, I agree with you.

One small question, wouldn't it perhaps be better if the object were allocated on the stack rather than the heap ?

buf := &strings.Builder{}  // forces heap allocation

Better:

var buf strings.Builder  // stack allocation (if it doesn't escape)

Also, we could start the loop in the method (ref Ref) GroundPrefix() Ref from 1 immediately:

func (ref Ref) GroundPrefix() Ref {
    if ref.IsGround() {
        return ref
    }

    // Head is always ground by definition, start checking from index 1
    for i := 1; i < len(ref); i++ {
        if !ref[i].IsGround() {
            return ref[:i] 
        }
    }

    return ref
}

alex60217101990 avatar Dec 09 '25 13:12 alex60217101990

In addition to your comment.

I looked at the code via the link you attached, thank you. Looking at Go's stdlib url.escape() implementation (source) To correctly pre-allocate the buffer, two passes are necessary. This is the same approach used by Go's standard library.

As suggested, creating a reusable PathEscapeAppender in a util package would be valuable if this pattern is used in multiple places. Proposed utility for v1/util/escape.go:

package util

import "strings"

const upperhex = "0123456789ABCDEF"

// shouldEscapePathSegment checks if byte needs escaping in URL path segment
// Based on RFC 3986 and Go's url.shouldEscape(c, encodePathSegment)
func shouldEscapePathSegment(c byte) bool {
    // Unreserved characters (RFC 3986 ยง2.3)
    if 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9' {
        return false
    }
    switch c {
    case '-', '_', '.', '~':
        return false
    // Additional allowed in path segment
    case '!', '$', '&', '\'', '(', ')', '*', '+', '=', ':', '@':
        return false
    }
    return true
}

// AppendPathEscape appends the path-escaped form of s to buf.
// This avoids intermediate allocations compared to url.PathEscape.
func AppendPathEscape(buf *strings.Builder, s string) {
    for i := 0; i < len(s); i++ {
        c := s[i]
        if shouldEscapePathSegment(c) {
            buf.WriteByte('%')
            buf.WriteByte(upperhex[c>>4])
            buf.WriteByte(upperhex[c&15])
        } else {
            buf.WriteByte(c)
        }
    }
}

// CountPathEscapeLen returns the length of s after path escaping.
// Useful for pre-allocating buffers.
func CountPathEscapeLen(s string) int {
    n := 0
    for i := 0; i < len(s); i++ {
        if shouldEscapePathSegment(s[i]) {
            n += 3 // %XX
        } else {
            n++
        }
    }
    return n
}

And then the Ref.Ptr() method:

func (ref Ref) Ptr() (string, error) {
    tail := ref[1:]
    if len(tail) == 0 {
        return "", nil
    }

    // Calculate exact buffer size
    l := len(tail) - 1 // separators
    for i := range tail {
        str, ok := tail[i].Value.(String)
        if !ok {
            return "", errors.New("invalid path value type")
        }
        l += util.CountPathEscapeLen(string(str))
    }

    var buf strings.Builder
    buf.Grow(l)

    for i := range tail {
        if i > 0 {
            buf.WriteByte('/')
        }
        util.AppendPathEscape(&buf, string(tail[i].Value.(String)))
    }
    return buf.String(), nil
}

alex60217101990 avatar Dec 10 '25 07:12 alex60217101990