opa Decision Logs/Status API payload transformation

What part of OPA would you like to see improved?

Decision Logs and Status API, specifically the payload formats. Being able to alter the JSON format of these two payloads would be useful when integrating with certain external systems. Some services require a specific payload format in order to store or process data, and as of today OPA has no way of customizing the decision log and status API payload formats. A good example is Google Cloud Pub/Sub, which requires a payload like this:

{
    "messages": [
        {
            "data": "base64_encoded_payload", <-- could be the original OPA payload as base64
            "orderingKey": "some_ordering_key",
            "attributes": {"somekey": "somevalue"}
        }
    ]
}

Examples below are for a GCP Pub/Sub Status API payload.

Describe the ideal solution

A simple-to-use solution would be an additional config option that specifies the payload format type. It could be service-centric, allowing values such as "gcpPubSub" or "opa" (the default). OPA would then automatically apply the corresponding payload transformation function before sending out the POST requests.
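
To make the idea concrete, here is a rough sketch in Python (not OPA code) of how a format selector could dispatch to a transformation function. Only the format names "gcpPubSub" and "opa" and the Pub/Sub envelope shape come from this issue; the function names, the FORMATTERS table and the ordering_key/attributes options are hypothetical.

```python
import base64
import json
from typing import Callable, Dict, Optional


def format_opa(payload: dict, **_ignored) -> dict:
    # Default behaviour: the payload is sent unchanged.
    return payload


def format_gcp_pubsub(payload: dict, ordering_key: str = "",
                      attributes: Optional[Dict[str, str]] = None) -> dict:
    # Pub/Sub expects the message body as a base64 string under "data".
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    message = {"data": base64.b64encode(raw).decode("ascii")}
    if ordering_key:
        message["orderingKey"] = ordering_key
    if attributes:
        message["attributes"] = attributes
    return {"messages": [message]}


# Hypothetical lookup table keyed by the proposed config value.
FORMATTERS: Dict[str, Callable[..., dict]] = {
    "opa": format_opa,
    "gcpPubSub": format_gcp_pubsub,
}


def transform(payload: dict, payload_format: str = "opa", **options) -> dict:
    # Pick the transformation by name and fail loudly on unknown formats.
    try:
        formatter = FORMATTERS[payload_format]
    except KeyError:
        raise ValueError(f"unknown payload format: {payload_format!r}")
    return formatter(payload, **options)
```

With this shape, supporting another service would mean adding one function and one table entry, which is also why the set of formats stays fixed until someone ships a new one.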

Describe a "Good Enough" solution

Another way of achieving this, although harder to spec, would be to rely on the user to provide the format, using placeholders and/or functions. The config YAML or even Rego could be used for this purpose. Example config YAML:

payloadFormat: |
        {
            "messages": [
                {
                    "data": "base64($opaPayload)",
                    "orderingKey": "some_key",
                    "attributes": {"id": "$agentId"}
                }
            ]
        }
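
As a rough illustration of what rendering such a template could involve, here is a sketch in Python (not OPA code). It assumes just two constructs, $name placeholders and a base64($name) function, and that all variable values are strings; the render function and the two regular expressions are hypothetical.

```python
import base64
import json
import re

TEMPLATE = """
{
    "messages": [
        {
            "data": "base64($opaPayload)",
            "orderingKey": "some_key",
            "attributes": {"id": "$agentId"}
        }
    ]
}
"""

_BASE64_CALL = re.compile(r"base64\(\$(\w+)\)")
_PLACEHOLDER = re.compile(r"\$(\w+)")


def render(template: str, variables: dict) -> dict:
    def encode(match):
        # base64 output only uses A-Z, a-z, 0-9, "+", "/" and "=",
        # so it is safe inside a JSON string without further escaping.
        value = variables[match.group(1)]
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    def substitute(match):
        # Escape the value so it stays valid inside a JSON string literal.
        return json.dumps(variables[match.group(1)])[1:-1]

    # Resolve base64(...) calls first so their inner placeholder is not
    # consumed by the plain substitution pass.
    text = _BASE64_CALL.sub(encode, template)
    text = _PLACEHOLDER.sub(substitute, text)
    # Parsing the result catches templates that do not render to valid JSON.
    return json.loads(text)
```

Even this small version needs escaping rules and an evaluation order, which hints at why the approach is harder to spec than a fixed list of formats.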

For REGO, we could imagine a policy that receives the payload as an input string and has to return the modified JSON object. Example REGO:

package jsontransform

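# Wrap the original OPA payload in the Pub/Sub publish request format.
# "messages" must be an array, matching the format shown earlier.
payload = result {
    result := {
        "messages": [
            {
                "data": base64.encode(input.payload),
                "orderingKey": "some_ordering_key",
                "attributes": {
                    # Extra data taken from the agent's own configuration.
                    "id": opa.runtime().config.labels.id
                }
            }
        ]
    }
}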

This policy file would also have to be referenced in the config. Also, for decision logs, maybe input.payload would already have to be base64-encoded, since the upload is gzipped by default?
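
To illustrate the base64 question, here is a small Python sketch (not OPA code) of the round trip, assuming the decision log batch reaches the transformation already gzipped. The function names and the sample event are made up for the example.

```python
import base64
import gzip
import json


def wrap_gzipped(events: list) -> dict:
    # Assumed stand-in for a gzipped decision log batch.
    compressed = gzip.compress(json.dumps(events).encode("utf-8"))
    # Raw gzip bytes are not valid UTF-8 text, so they cannot be placed in a
    # JSON string directly; base64 turns them into plain ASCII.
    return {"messages": [{"data": base64.b64encode(compressed).decode("ascii")}]}


def unwrap_gzipped(envelope: dict) -> list:
    # What a subscriber would do: base64-decode, then gunzip, then parse.
    compressed = base64.b64decode(envelope["messages"][0]["data"])
    return json.loads(gzip.decompress(compressed))
```

If the transformation ran before compression instead, input.payload could stay a plain JSON string and base64.encode in the policy would be enough.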

Jun 20 '22 10:06 alin-florin-ciu-db

I like the package jsontransform idea there. Using Rego wherever we can is appreciated 😎 That said, I wonder if the configurability is really needed. If there's only our current format, GCP, and AWS (I'm sure there's something AWS-y here, too), we could hardcode those.

That said, I believe the previous approach there was creating custom plugins. Is that a viable approach for you?

Jun 20 '22 12:06 srenatus

Having a set of hardcoded formats could be the easiest way of plugging in this functionality, I agree. However, the solution based on REGO offers greater flexibility, and I'm mainly thinking about these three points:

it would allow integration with any format out there, not just the PubSub/SQS/etc. formats, which may themselves change in the future.
it allows for plugging extra data into said payloads; see the opa.runtime().config.labels.id line in my example.
it allows for decisions at payload-building time, or even for enriching existing payloads (thinking Status API here).

One potential drawback of the REGO solution: performance impact (but I think if the logic is restricted to string/object manipulation it should be OK).

AFAIK, custom plugins can't be dynamically loaded into OPA; users would have to compile and maintain their own flavour of the agent in order to use them (and maybe even learn Go in the process). I was thinking more along the lines of a general, highly configurable solution based on a core OPA functionality: the REGO language.

Thinking about it, I think REGO could be the "best solution" to this issue, but some feedback would be appreciated.

Jun 20 '22 12:06 alin-florin-ciu-db

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

Jul 20 '22 22:07 stale[bot]
