opa Add support to use print with a log level of choice.

Add support to use print with a log level of choice.

Open TweeZz opened this issue 2 years ago • 12 comments

What part of OPA would you like to see improved?

Currently, using 'print' uses the log 'info' log level. We would like to be able to specify at which level a print statement should log.

Describe the ideal solution

I'm fairly new to opa. Therefore it's a bit hard to suggest how this could look like. My intuitive suggestion would be: print.info("This is an info log entry") print.error("This is an error log entry") print.debug("This is an debug log entry")

Describe a "Good Enough" solution

A good enough solution to me would be whatever syntax you come up with to allow us to print on any level of choice :)

May 06 '22 19:05 TweeZz

Hi @TweeZz 👋 And thanks for submitting this.

That the print function uses the logging system when OPA is started as a server should not be taken to mean that the print function should be used as a logger. print is solely meant to be a tool to help with debugging, and if you run OPA at --log-level error — which is the recommended level for production — print statements are even erased from the policy at the time they are loaded into the runtime, so that they'll have no impact on performance.

For production, you'll probably want to look into decision logging.

Some background on the print function and its implementation can be found in the OPA blog.

May 09 '22 09:05 anderseknert

Hi..

Thanks for your feedback. We already have local decision logs enabled. On all our environments, we can dig into those. Yes, error is the recommended level to use on production. But there are situations where you need to (temporarily) change that to 'info' or even to 'debug' in order to investigate an issue on production which for whatever reason, you cannot simulate on another environment. Using print somewhere in the code adds an entry in the decision logs like so: {"decision-id":"5aca73ce-a437-4dec-8a24-e287f7566f69","level":"info","msg":"./policies.rego:218: some useful log entry here","time":"2022-05-12T08:15:29Z"}

I have read the decision logging page you referred to, but did not see how we could create a 'custom' decision log entry with for example a 'debug' level. Is there currently no way to create such an entry? Maybe I overlooked something :)

With any other programming language I have used (C++, C#, java, nodejs), there are means to add log entries with different log levels. We feel this as a missing feature..

TweeZz.

ps: I'm in Stockholm at the moment for the AWS Summit :)

May 12 '22 08:05 TweeZz

Hey again! 😃

Using print somewhere in the code adds an entry in the decision logs like so: {"decision-id":"5aca73ce-a437-4dec-8a24-e287f7566f69","level":"info","msg":"./policies.rego:218: some useful log entry here","time":"2022-05-12T08:15:29Z"}

That is not a decision log entry, but the standard logger output from a print call. A decision log entry provides:

The full input provided for query evaluation.
The result.
Revision of the bundles loaded into OPA (if provided in the bundles).
Some other metadata.

Provided you have built your bundles with a revision (commonly a git commit hash, but could be anything) that will let you identify what policy/data was loaded into OPA at the time of the decision, you should be able to replay the decision later with the same input/policy, and debug in any environment using the usual tooling.

Yes, error is the recommended level to use on production. But there are situations where you need to (temporarily) change that to 'info' or even to 'debug' in order to investigate an issue on production which for whatever reason, you cannot simulate on another environment.

Unless your policies rely heavily on http.send, or environment variables from opa.runtime(), the environment should have no impact on the decisions made.

With any other programming language I have used (C++, C#, java, nodejs), there are means to add log entries with different log levels. We feel this as a missing feature..

Right, but Rego isn't a general purpose programming language, but a purpose-built policy language. If you compare to other query languages you'll likely see many of the same characteristics there.

All that said — if there's a strong case for logging something that's not easily covered by the decision log module, we obviously should look into how to improve that.

ps: I'm in Stockholm at the moment for the AWS Summit :)

Oh, nice! Are you still around tomorrow? Feel free to drop by our office if you'd like to come by and say hi, for lunch or a beer in the afternoon. Shoot me an email at anders [at] styra.com if so :)

May 12 '22 19:05 anderseknert

Hey..

Provided you have built your bundles with a revision (commonly a git commit hash, but could be anything) that will let you identify what policy/data was loaded into OPA at the time of the decision, you should be able to replay the decision later with the same input/policy, and debug in any environment using the usual tooling.

We are required by our Product & Solution Security Expert to mask out certain parts of the input. One of those being /input/attributes/request/http/headers as we do not want to full encoded token to be logged. We would however like to log certain things of the token we decode in our policy.

Yes, error is the recommended level to use on production. But there are situations where you need to (temporarily) change that to 'info' or even to 'debug' in order to investigate an issue on production which for whatever reason, you cannot simulate on another environment.

Unless your policies rely heavily on http.send, or environment variables from opa.runtime(), the environment should have no impact on the decisions made.

That is exactly our case. We have quite a few http.send usages in our policy as we heavily rely on a few PIP (Policy Information Point) endpoints.

With any other programming language I have used (C++, C#, java, nodejs), there are means to add log entries with different log levels. We feel this as a missing feature..

Right, but Rego isn't a general purpose programming language, but a purpose-built policy language. If you compare to other query languages you'll likely see many of the same characteristics there.

All that said — if there's a strong case for logging something that's not easily covered by the decision log module, we obviously should look into how to improve that.

We still feel the lack of being able to do some logging in several levels. Our policy code is quite complex and it can be really helpful to print / log certain things, also on production. Maybe we are the only ones. I was hoping to get some backup from other people :)

May 19 '22 14:05 TweeZz

We are required by our Product & Solution Security Expert to mask out certain parts of the input. One of those being /input/attributes/request/http/headers as we do not want to full encoded token to be logged. We would however like to log certain things of the token we decode in our policy.

Sure.. that sounds like decision log masking, no? :)

That is exactly our case. We have quite a few http.send usages in our policy as we heavily rely on a few PIP (Policy Information Point) endpoints.

This is probably what you'd want then.

We still feel the lack of being able to do some logging in several levels. Our policy code is quite complex and it can be really helpful to print / log certain things, also on production.

I'd still decision logging should cover that, but if you could provide an example where it wouldn't, that would be useful!

Maybe we are the only ones. I was hoping to get some backup from other people :)

Yeah, leaving this open for a while to see if others want to chip in :)

May 19 '22 18:05 anderseknert

Just circling back on this, did you find some time to look into decision logging yet, @TweeZz ?

May 30 '22 12:05 anderseknert

Hey..

Sure.. that sounds like decision log masking, no? :)

We are using decision log masking. We mask the complete http headers. mask["/input/attributes/request/http/headers"] But this does not help with what I wrote:

We would however like to log certain things of the token we decode in our policy

The http headers contain an encoded token. We would like to log certain claims of the decoded token.

Example:

# retrieve the decoded token from the encoded token
decoded_token = {"header": header, "payload": payload} {
    [header, payload, _] := io.jwt.decode(encoded_token)
   # For whatever reason, we'd like to log header.kid here. At times, also on production.
}

is_location_allowed {
    is_nav_token
    nav_pip_response := request_nav_pip
    nav_pip_response.status_code == 200
    count(nav_pip_response.body) == 1
} else {
    not is_nav_token
    iam_pip_response := request_iam_pip
    iam_pip_response.status_code == 200
    # For whatever reason, we'd like to log a part of the iam_pip_reponse here. At times, also on production.
}

https://github.com/open-policy-agent/opa/issues/1514 is probably what you'd want then.

Might be, yes.

Never mind. We will work with what is available. We're too much used to logging possibilities with other languages :)

May 30 '22 14:05 TweeZz

Thanks for expanding on that, @TweeZz 👍 Sounds like you'd want a way to include arbitrary values in the decision log rather than having logger that prints to the console?

May 31 '22 11:05 anderseknert

Thank you for responding on my musings 😉 Printing to the console is fine.. The existing print function is fine as it allows us to print arbitrary values. It just always logs as info. We'd like more flexibility there. That's all 🙂

May 31 '22 11:05 TweeZz

I would have another argument :)

For now, I still find writing policies in rego not easy, at times confusing. It's quite a different way of writing code as I'm used to :) I'm working on unit tests. It's super easy to make negative tests pass. It's less easy to make them pass correctly. I find myself adding tons of print statements in the code so I can "see" where things are happening and how. In C# or Java, most of those would be added as trace level logs. Already a few times, I removed such entries, because we don't want to have those logged in our DEV deployment. Then a few days later, when I continue to work on some unit tests, I start to add some print entries again. Having several log levels would make this more efficient.

Am I the only one missing log level support? :) My colleagues at Siemens also wouldn't mind :)

Jun 07 '22 08:06 TweeZz

Hi again 😃 If you have some examples of tests like that, I'd love to take a look!

There are a couple of open issues wrt having the test runner be more helpful in explaining where things went wrong, and to print the difference when assertions fail, and so on. I think there's still a lot to be done in order to help providing insights into what's going on in evaluation, and there are many options to explore here.

I think part of why "traditional" logging is problematic in Rego is because just like you said, Rego is different from what you're used to. Since Rego is a declarative language, taking a logging approach from imperative languages is likely going to cause as much confusion as it helps solve. Things like rule indexing and early exit optimizations might for example have the evaluator skip entire classes of rules, and log statements that "obviously" should should print something won't ever be considered for evaluation. This is already the case with the print built-in, and the evaluator actually behaves differently when print statements are present in rules just to help making debugging more "obvious", even if it means disabling optimizations. That's why print statements are effectively erased from policies when the log level is set to ERROR, which is the recommended level for production. If we didn't, simple print calls would have potentially large negative impact on performance.

Another (lesser) problem with "imperative" logging is that it easily leads to users making assumptions about the order of evaluation, since that's the expectation they'd bring from other (imperative) languages.

You can read more on the details of the print implementation in the OPA blog.

I don't think we disagree on the need to make the evaluation process more transparent, we'll just need to figure out how to best accomplish that :)

For testing perhaps you'll find this little library helpful? :)

Jun 07 '22 09:06 anderseknert

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

Jul 07 '22 12:07 stale[bot]

🧹 Closing this for now as it's not likely to be implemented. Good discussion above though for anyone interested!

Jan 05 '23 12:01 anderseknert

opa opa copied to clipboard

Add support to use print with a log level of choice.

What part of OPA would you like to see improved?

Describe the ideal solution

Describe a "Good Enough" solution

opa
opa copied to clipboard