opa icon indicating copy to clipboard operation
opa copied to clipboard

Panics when opa running as a server

Open alam-chime opened this issue 1 year ago • 2 comments

Short description

We're running opa server as a sidecar in kubernetes. At the time of the issue, both memory and cpu usage were well below the defined limits. For the majority of requests, we are receiving the expected outcomes. But there are a few instances where we're seeing HTTP 502s and 504s from opa. There are no differences between the inputs of the failing requests and those that succeed.

  • OPA version: 0.66.0, but we're seeing this behavior with 0.69.0 also
  • Policy bundle created using
                "opa", 
                "build",
		"--bundle", "./source",
		"--optimize", "2",
		"--output", "policy-bundle.tar.gz",
  • Server commands
                "opa",
		"run",
		"--server",
		"--addr", "localhost:8181",
		"--diagnostic-addr", ":8282",
		"--log-level", "error",
		"--config-file", "/config.yaml",
		"--authorization=basic",
		"--bundle", "policy-bundle.tar.gz",
  • Server error logs
2024/10/12 15:21:01 http: panic serving 127.0.0.1:46936: runtime error: invalid memory address or nil pointer dereference
goroutine 415141 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0x2102610?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000ed5cb0})
/src/internal/deepcopy/deepcopy.go:19 +0x173
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc0007529f0})
/src/internal/deepcopy/deepcopy.go:19 +0x14e
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000b0b230})
/src/internal/deepcopy/deepcopy.go:19 +0x14e
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000b0b1d0})
/src/internal/deepcopy/deepcopy.go:19 +0x14e
github.com/open-policy-agent/opa/plugins/logs.maskRuleSet.Mask({0xc00096ab20, {0xc0007284e0, 0x3, 0x4}, 0x0}, 0xc001136668)
/src/plugins/logs/mask.go:339 +0x11d
github.com/open-policy-agent/opa/plugins/logs.(*Plugin).maskEvent(0xc0001d1ce0, {0x1804978, 0xc000de66f0}, {0x17fa720, 0xc000de6ae0}, {0x1807280, 0xc001250000}, 0xc001136668)
/src/plugins/logs/plugin.go:1039 +0x25b
github.com/open-policy-agent/opa/plugins/logs.(*Plugin).Log(0xc0001d1ce0, {0x1804978, 0xc000de66f0}, 0xc000158c60)
/src/plugins/logs/plugin.go:704 +0x4b8
github.com/open-policy-agent/opa/runtime.(*Runtime).decisionLogger(0x17f4710?, {0x1804978, 0xc000de66f0}, 0xc000158c60)
/src/runtime/runtime.go:789 +0x6a
github.com/open-policy-agent/opa/server.decisionLogger.Log({0xc000de6b40?, {0x0?, 0xf?}, 0xc0002e1510?}, {0x1804978, 0xc000de66f0}, {0x17fa720, 0xc000de6ae0}, {0xc00036803e, 0xd}, ...)
/src/server/server.go:2992 +0x7b0
github.com/open-policy-agent/opa/server.(*Server).v1DataPost(0xc0001cc6c8, {0x1802e40, 0xc0001ecbd0}, 0xc000ba30e0)
/src/server/server.go:1796 +0xf0f
net/http.HandlerFunc.ServeHTTP(0x0?, {0x1802e40?, 0xc0001ecbd0?}, 0xc000def1c0?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/internal/prometheus.(*Provider).InstrumentHandler.func1({0x7f21a71208a0, 0xc000de6600}, 0xc000ba30e0)
/src/internal/prometheus/prometheus.go:89 +0x136
net/http.HandlerFunc.ServeHTTP(0x1802e40?, {0x7f21a71208a0?, 0xc000de6600?}, 0xc000def2c0?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func1({0x1802e40, 0xc0001ec018}, 0xc000ba30e0)
/src/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:97 +0xfd
net/http.HandlerFunc.ServeHTTP(0xc000ba2fc0?, {0x1802e40?, 0xc0001ec018?}, 0x0?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000000000, {0x1802e40, 0xc0001ec018}, 0xc000ba2480)
/src/vendor/github.com/gorilla/mux/mux.go:212 +0x1e2
github.com/open-policy-agent/opa/server/authorizer.(*Basic).ServeHTTP(0xc000789620, {0x1802e40, 0xc0001ec018}, 0xc0001ec018?)
/src/server/authorizer/authorizer.go:129 +0x5a4
net/http.HandlerFunc.ServeHTTP(0x0?, {0x1802e40?, 0xc0001ec018?}, 0xc0006a3690?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/internal/prometheus.(*Provider).InstrumentHandler.func1({0x7f21a71208a0, 0xc000b941e0}, 0xc000ba2000)
/src/internal/prometheus/prometheus.go:89 +0x136
net/http.HandlerFunc.ServeHTTP(0x1802150?, {0x7f21a71208a0?, 0xc000b941e0?}, 0x41af91?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func1({0x1802150, 0xc000b9a0a0}, 0xc000ba2000)
/src/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:97 +0xfd
net/http.HandlerFunc.ServeHTTP(0xc000b94150?, {0x1802150?, 0xc000b9a0a0?}, 0x12de5b5?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/server.(*Server).initHandlerCompression.CompressHandler.func1({0x1802150, 0xc000b9a0a0}, 0xc000ba2000)
/src/server/handlers/compress.go:41 +0x175
net/http.HandlerFunc.ServeHTTP(0xc0006a3ad8?, {0x1802150?, 0xc000b9a0a0?}, 0x11?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/runtime.(*LoggingHandler).ServeHTTP(0xc0005ebd10, {0x1801be0, 0xc00078e0e0}, 0xc000ba2000)
/src/runtime/logging.go:116 +0xad2
net/http.serverHandler.ServeHTTP({0x17fe1f0?}, {0x1801be0?, 0xc00078e0e0?}, 0x6?)
/usr/local/go/src/net/http/server.go:3137 +0x8e
net/http.(*conn).serve(0xc000b96000, {0x1804978, 0xc0003ae0f0})
/usr/local/go/src/net/http/server.go:2039 +0x5e8
created by net/http.(*Server).Serve in goroutine 111
/usr/local/go/src/net/http/server.go:3285 +0x4b4

Steps To Reproduce

We haven't been able to reproduce this issue locally, but we'll provide an update if we're successful.

Expected behavior

there shouldn't be a panic and opa server should respond back with the decision.

Additional context

This issue happens randomly, with no difference in the input between the requests that panic and the ones that succeed. The policy and data files are too big to share here, but I can create a smaller example if needed. Maybe the error logs are helpful for now?

alam-chime avatar Oct 12 '24 17:10 alam-chime

Hey, and thanks for filing this issue! Those error logs are helpful indeed. A few of us looked into this briefly today, and tbh, this is quite a mystery 🕵️ There's nothing obvious on those lines that should cause a nil dereference under any normal circumstances... and we've not yet managed to come up with even exceptional circumstance that provably cause one. If there's some obvious case we've overlooked, I'd be happy to hear about it!

anderseknert avatar Oct 14 '24 18:10 anderseknert

thanks @anderseknert for taking a look. Do you have any suggestions for debugging this issue? I'm also adding more error logs related to the panics we're seeing. I'll get back with a small example for our setup soon.

2024/10/11 09:12:51 http: panic serving 127.0.0.1:37098: runtime error: makeslice: len out of range
goroutine 950702 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0x17f50f0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x114a0e0?, 0x17f50f0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.mapEncoder.encode({0x100000002?}, 0xc000dc8240, {0x1135cc0?, 0x666390?, 0x1135cc0?}, {0x5f?, 0x85?})


2024/10/12 16:27:42 http: panic serving 127.0.0.1:41396: can't call pointer on a non-pointer Value
goroutine 623239 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x10cf800?, 0x17f6460?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x10cf800?, 0x17f6460?})
/usr/local/go/src/runtime/panic.go:770 +0x132
reflect.Value.pointer(...)
/usr/local/go/src/reflect/value.go:110
reflect.Value.lenNonSlice({0x1135cc0?, 0xc000b90cf0?, 0x41a205?})


2024/10/13 22:14:10 http: panic serving 127.0.0.1:52366: reflect: call of reflect.Value.IsNil on zero Value
goroutine 1459006 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x112a800?, 0xc000879c68?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x112a800?, 0xc000879c68?})
/usr/local/go/src/runtime/panic.go:770 +0x132
reflect.Value.IsNil(...)
/usr/local/go/src/reflect/value.go:1574
encoding/json.interfaceEncoder(0xc0005c1080, {0x1115f80?, 0xc000682f40?, 0x43?}, {0xe0?, 0x5f?})
/usr/local/go/src/encoding/json/encode.go:654 +0x110


2024/10/13 14:40:25 http: panic serving 127.0.0.1:35712: unexpected map key type
goroutine 1730644 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x10cf800?, 0x17f5430?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x10cf800?, 0x17f5430?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.resolveKeyName({0x10cf800?, 0xc000ac6210?, 0xc000ac0660?})


2024/10/14 20:20:17 http: panic serving 127.0.0.1:41322: runtime error: hash of unhashable type ast.String
goroutine 586544 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0xc0011b1070?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/topdown.(*baseCache).Put(...)
/src/topdown/cache.go:126
github.com/open-policy-agent/opa/topdown.(*eval).resolveReadFromStorage(0xc0011cc400, {0xc000578828, 0x3, 0x3}, {0x0, 0x0})
/src/topdown/eval.go:1641 +0x58f
github.com/open-policy-agent/opa/topdown.(*evalResolver).Resolve(0xc0011d3ec0, {0xc000578828, 0x3, 0x3})
/src/topdown/eval.go:1563 +0x657
github.com/open-policy-agent/opa/topdown.(*eval).Resolve(...)
/src/topdown/eval.go:1480


2024/10/14 19:43:45 http: panic serving 127.0.0.1:42696: reflect: call of reflect.Value.Bool on zero Value
goroutine 1640899 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x112a800?, 0xc0010d2ae0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x112a800?, 0xc0010d2ae0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
reflect.flag.mustBe(...)
/usr/local/go/src/reflect/value.go:233
reflect.Value.panicNotBool({0x10cfc00?, 0x2025768?, 0x532965?})
/usr/local/go/src/reflect/value.go:302 +0x7e
reflect.Value.Bool(...)
/usr/local/go/src/reflect/value.go:296


2024/10/14 18:57:46 http: panic serving 127.0.0.1:47100: runtime error: invalid memory address or nil pointer dereference
goroutine 227179 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0x2102610?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000742870})
/src/internal/deepcopy/deepcopy.go:19 +0x173
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc001165980})


2024/10/12 13:12:22 http: panic serving 127.0.0.1:37580: illegal value: ast.String
goroutine 485802 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x10cf800?, 0xc0003a8310?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/ast.sortOrder({0x11d3b40?, 0xc00039fae0?})
/src/ast/compare.go:302 +0x3a5
github.com/open-policy-agent/opa/ast.Compare({0x11d3b40?, 0xc00039fbb0?}, {0x11d3b40?, 0xc00039fae0?})
/src/ast/compare.go:68 +0x10e
github.com/open-policy-agent/opa/ast.objectElemSlice.Less(...)
/src/ast/term.go:2087

alam-chime avatar Oct 15 '24 19:10 alam-chime

I'll get back with a small example for our setup soon.

@alam-chime Did you happen to put together the example? If you're able to share any additional steps or information about your OPA config, it could be helpful for debugging. :slightly_smiling_face:

philipaconrad avatar Oct 23 '24 21:10 philipaconrad

sorry @philipaconrad I missed this message. Will share an example this week.

alam-chime avatar Nov 13 '24 20:11 alam-chime

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

stale[bot] avatar Dec 14 '24 06:12 stale[bot]

@alam-chime let us know in case this never got resolved! Since there hasn't been any activity here since November, I'll close the issue for now. Will re-open if needed!

anderseknert avatar Jun 09 '25 13:06 anderseknert