pulumi-kubernetes-operator

Operator keeps writing log files

skdltmxn opened this issue 2 years ago

What happened?

After upgrading the operator to 1.11.2, I see many log files being generated in /tmp.
It seems like there are three types of log files: INFO, WARNING and ERROR.
Unlike https://github.com/pulumi/pulumi/issues/12263#issue-1597281168, where the pipes didn't actually take any space, these log files do take up disk space.
I checked what was actually written to the files and found that they all have the same content.

Log file created at: 2023/03/11 06:36:31
Running on machine: pulumi-kubernetes-operator-756d7b6c98-znfp7
Binary: Built with gc go1.20.1 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0311 06:36:31.070780    6999 log.go:84] GitHub rate limit exceeded, try again in 23m37.929230884s. You can set GITHUB_TOKEN to make an authenticated request with a higher rate limit.
E0311 06:36:31.080911    6999 log.go:84] GitHub rate limit exceeded, try again in 23m37.919093344s. You can set GITHUB_TOKEN to make an authenticated request with a higher rate limit.

Here are two questions:

  1. Why do I see GitHub rate limit errors in 1.11.2? There were no such errors in previous versions, and they don't seem to affect my stack reconciliation.
  2. I see the same errors in all three types of log files: INFO, WARNING and ERROR. Why do the INFO and WARNING log files also contain error logs? (See the sketch below.)
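For context, the header format shown above ([IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg) is the one produced by Go's glog-style logging, which by default writes per-severity files into the temp directory (i.e. /tmp) and duplicates each error entry into the WARNING and INFO files as well as the ERROR file. A minimal Go sketch of that behavior (illustrative only, using github.com/golang/glog; this is not the operator's or the CLI's actual logging code):

package main

import (
	"flag"

	"github.com/golang/glog"
)

func main() {
	// Without --logtostderr, glog writes one file per severity into --log_dir,
	// which defaults to os.TempDir(), i.e. /tmp.
	flag.Parse()

	// A single Error entry is written to the ERROR file and duplicated into
	// the WARNING and INFO files, so all three files end up with the same lines.
	glog.Error("GitHub rate limit exceeded, try again later")

	// Flush buffered entries to disk before the program exits.
	glog.Flush()
}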

The following is a screen capture of my /tmp directory: [screenshot]

Expected Behavior

There should be no redundant log files left in /tmp.

Steps to reproduce

Run pulumi-kubernetes-operator v1.11.2 with constant stack reconciliation enabled.

Output of pulumi about

CLI
Version      3.57.1
Go Version   go1.20.1
Go Compiler  gc

Host
OS       debian
Version  11.6
Arch     x86_64

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

skdltmxn avatar Mar 11 '23 07:03 skdltmxn

Tried the --logtostderr option, but it doesn't work...

skdltmxn avatar Mar 11 '23 17:03 skdltmxn

Assigning to @squaremo, who has more context about the v1.11.2 release, for further triage.

rquitales avatar Mar 13 '23 07:03 rquitales

These logs are produced by pulumi, which is (via the "Automation API") exec'ed by the operator. I updated both the API and the pulumi executable in the image, but didn't change how they are used, which makes me think it's a change in pulumi that causes this. I'll go looking.
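For readers unfamiliar with that setup, here is a minimal Go sketch of driving a stack through the Automation API (illustrative only; the stack name and program path are hypothetical, and this is not the operator's actual code). The Automation API exec's the pulumi binary, and it is that child process whose log files land in /tmp:

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/pulumi/pulumi/sdk/v3/go/auto"
	"github.com/pulumi/pulumi/sdk/v3/go/auto/optup"
)

func main() {
	ctx := context.Background()

	// Select (or create) a stack for a program on local disk; the operator
	// does the equivalent for the repository it has checked out.
	stack, err := auto.UpsertStackLocalSource(ctx, "dev", "/path/to/program")
	if err != nil {
		panic(err)
	}

	// Up() shells out to the pulumi executable; any file logging that
	// executable does happens in that child process.
	res, err := stack.Up(ctx, optup.ProgressStreams(os.Stdout))
	if err != nil {
		panic(err)
	}
	fmt.Println("update result:", res.Summary.Result)
}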

Incidental question: are you using the GitHub provider in your program?

squaremo avatar Mar 13 '23 14:03 squaremo

> These logs are produced by pulumi, which is (via the "Automation API") exec'ed by the operator. I updated both the API and the pulumi executable in the image, but didn't change how they are used, which makes me think it's a change in pulumi that causes this. I'll go looking.
>
> Incidental question: are you using the GitHub provider in your program?

If you are asking whether I'm using GitHub for my program repo, then yes, I'm using GitHub Enterprise.

skdltmxn avatar Mar 13 '23 14:03 skdltmxn

Sorry, I should have been clearer: I meant are you using https://www.pulumi.com/registry/packages/github/ in your Pulumi stack, e.g.,

import * as github from "@pulumi/github";

const repo = new github.Repository("demo-repo", {
  description: "Generated from automated test",
  visibility: "private",
});

// etc.

I can think of three things that might be provoking a rate-limiting error from GitHub:

  • git operations, like cloning
  • using the GitHub provider programmatically in a pulumi stack
  • pulumi downloading a plugin, which accesses the GitHub API.

I'm just trying to figure out which of these is causing the message. There is at least one change relating to GitHub authorisation in pulumi/pulumi, for example https://github.com/pulumi/pulumi/pull/12392.
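For reference, a minimal Go sketch (illustrative only; not the operator's or the CLI's code) of the kind of GitHub API request a plugin download involves, and how GITHUB_TOKEN changes the rate limit the error message refers to:

package main

import (
	"fmt"
	"net/http"
	"os"
)

func main() {
	// Roughly the kind of query a plugin download makes against the GitHub API.
	req, err := http.NewRequest("GET",
		"https://api.github.com/repos/pulumi/pulumi/releases/latest", nil)
	if err != nil {
		panic(err)
	}

	// Unauthenticated requests share a small per-IP rate limit; an
	// Authorization header built from GITHUB_TOKEN gets a much higher one.
	if token := os.Getenv("GITHUB_TOKEN"); token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// GitHub reports the current limit and what remains of it in these headers.
	fmt.Println("limit:    ", resp.Header.Get("X-RateLimit-Limit"))
	fmt.Println("remaining:", resp.Header.Get("X-RateLimit-Remaining"))
}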

squaremo avatar Mar 13 '23 14:03 squaremo

No, I'm not using the GitHub provider.
Since there is no problem syncing my infra, I guess it's due to plugin downloads?

skdltmxn avatar Mar 13 '23 14:03 skdltmxn

Added to epic https://github.com/pulumi/pulumi-kubernetes-operator/issues/586

cleverguy25 avatar Oct 29 '24 19:10 cleverguy25

Update: the situation has improved in Operator v2, but isn't completely resolved. v2 has a whole new architecture that uses pods as the execution environment. The logs for a given stack are isolated to that pod, but there are still some log files that aren't cleaned up over time.

Please read the announcement blog post for more information on v2: https://www.pulumi.com/blog/pulumi-kubernetes-operator-2-0/

Would love to hear your feedback! Feel free to engage with us on the #kubernetes channel of the Pulumi Slack workspace.

EronWright avatar Oct 29 '24 19:10 EronWright

I believe this issue is no longer relevant after the re-architecture we did in PKO v2. Testing on a long-running (4 days) Workspace pod, I have not noticed any Pulumi logs being persisted to disk.

CLI output to verify:

k exec -i nginx-stack-workspace-0 --tty -- sh
Defaulted container "pulumi" out of: pulumi, bootstrap (init), fetch (init)
$ ls /tmp
node-compile-cache		    python-build.20250131150322.145.log  python-build.20250131150557.229.log  tmp.aZLKaSwRTp	     tmpuzq9z9secacert.pem
python-build.20250131150200.91.log  python-build.20250131150440.187.log  python-build.20250131150708.271.log  tmpa4kf4ji_cacert.pem

rquitales avatar Feb 08 '25 00:02 rquitales