opa icon indicating copy to clipboard operation
opa copied to clipboard

`opa fmt` changes `\u0000` to `\x00` which leads to parser error

Open zregvart opened this issue 2 years ago • 4 comments

Short description

opa fmt will reformat this Rego (playground):

package play

x := "\u0000"

as

package play

x := "\x00"

Which causes rego_parse_error: illegal escape sequence.

Steps To Reproduce

  1. run opa eval 'nul := "\u0000"' see the result and nul bound to "\u0000"
  2. run echo 'package test\nnul := "\\u0000"' | opa fmt to see the conversion to nul := "\x00"
  3. opa eval 'nul := "\x00"' to see the rego_parse_error: illegal escape sequence error (in JSON)

Expected behavior

Formatting the code using opa fmt should not introduce parser errors. Not sure if opa fmt needs to be changed or the parser needs to allow \x... escape sequences.

Additional context

$ opa version
Version: 0.53.1
Build Commit: 947d65b372eebfde65686028a0c43b30dd7f04b2
Build Timestamp: 2023-06-06T07:12:49Z
Build Hostname: e0cbbd3ede6c
Go Version: go1.20.4
Platform: linux/amd64
WebAssembly: available

zregvart avatar Sep 08 '23 15:09 zregvart

Hi @zregvart, and thanks for reporting this 👍

Looks like the formatter calls the String() method on the term, which delegates to strconv.Quote and that's where the string is transformed into the "illegal" entity \x00. I'm honestly not sure about what the correct thing to do would be. I can't discern from the docs whether control characters in strings are supposed to be allowed or not 😅 So I guess whatever the outcome here is, it'd be good to add to the docs on strings what is supported and what isn't. And if they aren't allowed, the formatter should obviously be fixed to flag them.

anderseknert avatar Sep 11 '23 09:09 anderseknert

Just noting my workaround which is to use nul := base64.decode("AA=="). I don't think many folk will run into this issue. I needed it for calculating a digest of a specific format that has \0 as delimiter.

zregvart avatar Sep 11 '23 09:09 zregvart

Yep, I think any character that's valid JSON should be allowed, and as far as I know, using \u0000 to represent unicode null is valid, so the formatter should leave it be... but perhaps there are reasons for this behavior I'm not aware of, and somebody else could chip in to provide some background on that.

anderseknert avatar Sep 11 '23 10:09 anderseknert

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

stale[bot] avatar Oct 11 '23 11:10 stale[bot]