pulumi-kubernetes-operator
Pulumi Operator times out grabbing GitRepository from FluxSource
What happened?
I tried to integrate a Pulumi program written in Go via a Flux source, as described here: https://www.pulumi.com/blog/pulumi-kubernetes-new-2022/#integration-with-flux-sources
Unfortunately this fails with a timeout. The pulumi-operator times out while fetching the GitRepository artifact from the Flux source-controller:
{"level":"error","ts":"2023-07-15T11:46:44.275Z","logger":"controller_stack","msg":"Failed to setup Pulumi workdir","Request.Namespace":"pulumi-operator","Request.Name":"asgard-tst","Stack.Name":"tst","error":"failed to get artifact from source: failed to download archive, error: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/pulumi-operator/iac-asgard/044c892e78dd4f774bd349c31db790269ecc8f32.tar.gz giving up after 2 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/pulumi-operator/iac-asgard/044c892e78dd4f774bd349c31db790269ecc8f32.tar.gz\": dial tcp 10.43.196.204:80: i/o timeout","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}
{"level":"error","ts":"2023-07-15T11:46:44.275Z","logger":"controller_stack","msg":"Failed to update Stack","Request.Namespace":"pulumi-operator","Request.Name":"asgard-tst","Stack.Name":"tst","error":"failed to get artifact from source: failed to download archive, error: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/pulumi-operator/iac-asgard/044c892e78dd4f774bd349c31db790269ecc8f32.tar.gz giving up after 2 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/pulumi-operator/iac-asgard/044c892e78dd4f774bd349c31db790269ecc8f32.tar.gz\": dial tcp 10.43.196.204:80: i/o timeout","stacktrace":"github.com/pulumi/pulumi-kubernetes-operator/pkg/controller/stack.(*ReconcileStack).Reconcile\n\t/home/runner/work/pulumi-kubernetes-operator/pulumi-kubernetes-operator/pkg/controller/stack/stack_controller.go:687\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}
I was able to overcome this issue by adding this exact network policy for the flux source-controller:
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-pulumi-operator-source-grabbing
  namespace: flux-system
spec:
  podSelector:
    matchLabels:
      app: source-controller
  ingress:
    - ports:
        - protocol: TCP
          port: http
      from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: pulumi-operator
        - podSelector:
            matchLabels:
              name: pulumi-kubernetes-operator
  policyTypes:
    - Ingress
I'm not sure if my "outdated" flux or pulumi are the reason for this behavior, but they are set as follows:
Pulumi version in cluster: v3.68.0
Flux version: v0.41.2 (current: v2.0.1)
Expected Behavior
A chapter in the documentation describing how to add the above network policy, or a built-in mechanism that creates it for you.
Steps to reproduce
Follow the documentation at https://www.pulumi.com/blog/pulumi-kubernetes-new-2022/#integration-with-flux-sources with the Pulumi and Flux versions mentioned above.
Output of pulumi about
From within the pulumi-operator pod:
Additional context
No response
Contributing
Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).
@kellervater Thanks for reporting this issue and providing the fix for this. You are indeed correct that a NetworkPolicy to enable ingress to Flux is required for the Pulumi Operator to interact with Flux's source-controller.
Looking at the examples we have in our Operator codebase, we explicitly enable ingress to Flux components when we install Flux using the Pulumi Flux provider. The default installation of Flux does not allow ingress, which likely explains the initial error you experienced.
I'll ensure that this requirement is documented somewhere. Thanks for bringing this up once again!
Added to epic https://github.com/pulumi/pulumi-kubernetes-operator/issues/586
Thanks again @kellervater for the report. Flux artifacts are downloaded from the source-controller in the flux-system namespace by the operator pod in the pulumi-kubernetes-operator namespace. Your policy example works well to allow the traffic.
Good news everyone, we just released a preview of Pulumi Kubernetes Operator v2. This new release has a whole new architecture that uses pods as the execution environment. We updated the documentation to include instructions for setting up a NetworkPolicy if you're using Flux. Thanks again @kellervater for the help.
Please read the announcement blog post for more information: https://www.pulumi.com/blog/pulumi-kubernetes-operator-2-0/
Would love to hear your feedback! Feel free to engage with us on the #kubernetes channel of the Pulumi Slack workspace.