pulumi-kubernetes
Server-side apply deletes resource if provider changes
I've got a simpler reproduction and explanation, but I'll leave the original issue text below for historical context.
What happened?
Resource replacement erroneously deletes the resource when the following conditions are true:
- Server-side apply mode is enabled
- The resource provider changes to another provider pointing at the same cluster (e.g., a default provider changes to an explicit provider)
This does not happen in client-side apply mode, because there the resource creation fails during replacement (the resource already exists).
In server-side apply mode, the create succeeds even if the resource already existed. This behavior was intended to support "upsert" semantics. Unfortunately, the "delete" portion of the resource replacement then removes the resource from the cluster. (deleteBeforeReplace still works as expected, but it is not an option for CustomResourceDefinition resources, since deleting a CRD also deletes the related CustomResource objects on the cluster.)
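For resource types where deleting first is safe (i.e., not CRDs), the standard deleteBeforeReplace resource option sidesteps the problem by inverting the replacement order. A minimal sketch:

import * as k8s from "@pulumi/kubernetes";

const provider = new k8s.Provider("k8s", {
    enableServerSideApply: true,
});

// deleteBeforeReplace makes the engine delete the old resource before
// creating the new one, so the post-create "delete" step never runs.
// Do NOT use this for CustomResourceDefinitions: deleting a CRD also
// deletes its CustomResource objects on the cluster.
new k8s.core.v1.ConfigMap("test", {
    metadata: { name: "test" },
    data: { foo: "bar" },
}, {
    provider,
    deleteBeforeReplace: true,
});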
Further complicating this problem, it is the engine, not the provider, that plans the resource replacement when the provider changes, and resource aliases do not currently support aliasing providers. It is possible to manually drop the resource from state and then import it with the new provider, but this requires changes outside of the program code, so it may not be an option for some users.
Expected Behavior
The provider should not delete the resource that it is expected to replace.
Steps to reproduce
- Run pulumi config set kubernetes:enableServerSideApply true
- Run pulumi up with the following program:
import * as k8s from "@pulumi/kubernetes";
const provider = new k8s.Provider("k8s", {
enableServerSideApply: true,
});
new k8s.core.v1.ConfigMap("test", {
metadata: {name: "test"},
data: {foo: "bar"}
}, {
provider,
});
- Comment out the provider option on the ConfigMap (see the sketch after this list)
- pulumi up shows that the resource will be replaced, but it will actually delete the resource
- Running pulumi up --refresh detects that the ConfigMap is missing, and recreates it
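For concreteness, the first step above amounts to this change; dropping the explicit provider makes the resource fall back to the default provider, which the engine plans as a replacement:

new k8s.core.v1.ConfigMap("test", {
    metadata: {name: "test"},
    data: {foo: "bar"}
}, {
    // provider,  // commenting this out switches the resource to the
                  // default provider and triggers the faulty replace
});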
Output of pulumi about
Dependencies:
NAME VERSION
@pulumi/pulumi 3.61.0
@pulumi/kubernetes 3.23.0
Related: https://github.com/pulumi/pulumi/issues/9925
Related: https://github.com/pulumi/pulumi-aws/issues/1923#issuecomment-1184890624
What happened?
We pass in { provider: k8sProvider } as part of the CRD's resource options (the last argument): const foobarcrd = new k8s.apiextensions.v1.CustomResourceDefinition(...). Then if we remove the { provider: k8sProvider }, pulumi preview shows:
View in Browser (Ctrl+O): https://app.pulumi.com/tushar-pulumi-corp/aws-classic-ts-eks-spot-mg/dev/previews/259d842b-1a54-410c-83f8-7a0420e07b58
Type Name Plan Info
pulumi:pulumi:Stack aws-classic-ts-eks-spot-mg-dev
+- └─ kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition shaht-foobarcrd replace [diff: ~provider]
Resources:
+-1 to replace
60 unchanged
The replacement is planned because of the diff in the provider.
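For reference, the declaration being changed looks roughly like this. This is a sketch reconstructed from the diff below and the resource options quoted in the repro steps; k8sProvider, namespace, and mycluster are defined in the linked repro, not here:

const foobarcrd = new k8s.apiextensions.v1.CustomResourceDefinition("shaht-foobarcrd", {
    spec: {
        group: "stable.example.com",
        names: {
            kind: "FooBar",
            plural: "foobars",
            shortNames: ["fb"],
            singular: "foobar",
        },
        scope: "Namespaced",
        versions: [{
            name: "v1",
            schema: { openAPIV3Schema: { type: "object" } },
            served: true,
            storage: true,
        }],
    },
}, { provider: k8sProvider, dependsOn: [namespace, mycluster] }); // removing `provider` triggers the replace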
The diff shows this:
++kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition: (create-replacement)
[id=foobars.stable.example.com]
[urn=urn:pulumi:dev::aws-classic-ts-eks-spot-mg::kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition::shaht-foobarcrd]
[provider: urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:providers:kubernetes::shaht-k8sprovider::acfa6c62-f920-40a3-9587-61e4ab3b1c33 => urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:providers:kubernetes::default_3_25_0::output<string>]
+-kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition: (replace)
[id=foobars.stable.example.com]
[urn=urn:pulumi:dev::aws-classic-ts-eks-spot-mg::kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition::shaht-foobarcrd]
If we select yes, we get the following error:
View in Browser (Ctrl+O): https://app.pulumi.com/tushar-pulumi-corp/aws-classic-ts-eks-spot-mg/dev/updates/84
Type Name Status Info
pulumi:pulumi:Stack aws-classic-ts-eks-spot-mg-dev **failed** 1 error
├─ eks:index:Cluster shaht-eks
│ └─ aws:eks:Cluster shaht-eks-eksCluster
+- └─ kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition shaht-foobarcrd **replacing failed** [diff: ~provider]; 1 error
Diagnostics:
pulumi:pulumi:Stack (aws-classic-ts-eks-spot-mg-dev):
error: update failed
kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition (shaht-foobarcrd):
error: resource foobars.stable.example.com was not successfully created by the Kubernetes API server : customresourcedefinitions.apiextensions.k8s.io "foobars.stable.example.com" already exists
Outputs:
- cluster_name : "shaht-eks-eksCluster-ae19d30"
- cluster_verion : "1.21"
- kubeconfig : [secret]
- managed_nodegroup_capacitytype: "SPOT"
- my_crd : "foobars.stable.example.com"
- my_namespace : "shaht-namespace-ad7d4939"
Expected Behavior
The CRD should be replaced successfully even when the replacement is across different providers.
Steps to reproduce
- Clone this repro: [email protected]:tusharshahrs/pulumi-home.git
cd aws-classic-ts-eks-spot-mgpulumi stack init devnpm installpulumi config set aws:region us-east-2# any valid aws regionpulumi up -y- validate that the crd is there:
pulumi stack output my_crdexpected output:foobars.stable.example.com - Comment out the provider on line 105: https://github.com/tusharshahrs/pulumi-home/blob/identityprofile/aws-classic-ts-eks-spot-mg/index.ts#L105 like this:
// { provider: k8sProvider, dependsOn: [namespace, mycluster]}); - Uncomment out the line below for the default provider to get picked up: https://github.com/tusharshahrs/pulumi-home/blob/identityprofile/aws-classic-ts-eks-spot-mg/index.ts#L105
like this:
{dependsOn: [namespace, mycluster]}); pulumi upand see that the preview will show that the crd is beingreplaced
View in Browser (Ctrl+O): https://app.pulumi.com/tushar-pulumi-corp/aws-classic-ts-eks-spot-mg/dev/previews/d642dc85-49ad-4e41-9149-b65f964790c6
Type Name Plan Info
pulumi:pulumi:Stack aws-classic-ts-eks-spot-mg-dev
+- └─ kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition shaht-foobarcrd replace [diff: ~provider]
Resources:
+-1 to replace
60 unchanged
Do you want to perform this update? details
pulumi:pulumi:Stack: (same)
[urn=urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:pulumi:Stack::aws-classic-ts-eks-spot-mg-dev]
++kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition: (create-replacement)
[id=foobars.stable.example.com]
[urn=urn:pulumi:dev::aws-classic-ts-eks-spot-mg::kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition::shaht-foobarcrd]
[provider: urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:providers:kubernetes::shaht-k8sprovider::acfa6c62-f920-40a3-9587-61e4ab3b1c33 => urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:providers:kubernetes::default_3_25_0::output<string>]
apiVersion: "apiextensions.k8s.io/v1"
kind : "CustomResourceDefinition"
metadata : {
labels: {
app.kubernetes.io/managed-by: "pulumi"
}
name : "foobars.stable.example.com"
}
spec : {
group : "stable.example.com"
names : {
kind : "FooBar"
plural : "foobars"
shortNames: [
[0]: "fb"
]
singular : "foobar"
}
scope : "Namespaced"
versions: [
[0]: {
name : "v1"
schema : {
openAPIV3Schema: {
type: "object"
}
}
served : true
storage: true
}
]
}
+-kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition: (replace)
[id=foobars.stable.example.com]
[urn=urn:pulumi:dev::aws-classic-ts-eks-spot-mg::kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition::shaht-foobarcrd]
[provider: urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:providers:kubernetes::shaht-k8sprovider::acfa6c62-f920-40a3-9587-61e4ab3b1c33 => urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:providers:kubernetes::default_3_25_0::output<string>]
~ metadata : {
- creationTimestamp: "2023-04-27T12:27:25Z"
- generation : 1
- managedFields : [
- [0]: {
- apiVersion: "apiextensions.k8s.io/v1"
- fieldsType: "FieldsV1"
- fieldsV1 : {
- f:status: {
- f:acceptedNames: {
- f:kind : {}
- f:listKind : {}
- f:plural : {}
- f:shortNames: {}
- f:singular : {}
}
- f:conditions : {
- k:{"type":"Established"} : {
- . : {}
- f:lastTransitionTime: {}
- f:message : {}
- f:reason : {}
- f:status : {}
- f:type : {}
}
- k:{"type":"NamesAccepted"}: {
- . : {}
- f:lastTransitionTime: {}
- f:message : {}
- f:reason : {}
- f:status : {}
- f:type : {}
}
}
}
}
- manager : "kube-apiserver"
- operation : "Update"
- time : "2023-04-27T12:27:25Z"
}
- [1]: {
- apiVersion: "apiextensions.k8s.io/v1"
- fieldsType: "FieldsV1"
- fieldsV1 : {
- f:metadata: {
- f:labels: {
- . : {}
- f:app.kubernetes.io/managed-by: {}
}
}
- f:spec : {
- f:conversion: {
- . : {}
- f:strategy: {}
}
- f:group : {}
- f:names : {
- f:kind : {}
- f:listKind : {}
- f:plural : {}
- f:shortNames: {}
- f:singular : {}
}
- f:scope : {}
- f:versions : {}
}
}
- manager : "pulumi-kubernetes"
- operation : "Update"
- time : "2023-04-27T12:27:25Z"
}
]
- resourceVersion : "1085"
- uid : "3d55882f-239e-418c-a2ec-416f4b407b70"
}
~ spec : {
- conversion: {
- strategy: "None"
}
~ names : {
- listKind : "FooBarList"
}
}
- status : {
- acceptedNames : {
- kind : "FooBar"
- listKind : "FooBarList"
- plural : "foobars"
- shortNames: [
- [0]: "fb"
]
- singular : "foobar"
}
- conditions : [
- [0]: {
- lastTransitionTime: "2023-04-27T12:27:25Z"
- message : "no conflicts found"
- reason : "NoConflicts"
- status : "True"
- type : "NamesAccepted"
}
- [1]: {
- lastTransitionTime: "2023-04-27T12:27:25Z"
- message : "the initial names have been accepted"
- reason : "InitialNamesAccepted"
- status : "True"
- type : "Established"
}
]
- storedVersions: [
- [0]: "v1"
]
}
--kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition: (delete-replaced)
[id=foobars.stable.example.com]
[urn=urn:pulumi:dev::aws-classic-ts-eks-spot-mg::kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition::shaht-foobarcrd]
[provider=urn:pulumi:dev::aws-classic-ts-eks-spot-mg::pulumi:providers:kubernetes::shaht-k8sprovider::acfa6c62-f920-40a3-9587-61e4ab3b1c33]
- Select yes and the following error will show up:
+- └─ kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition shaht-foobarcrd **replacing failed** [diff: ~provider]; 1 error
Diagnostics:
pulumi:pulumi:Stack (aws-classic-ts-eks-spot-mg-dev):
error: update failed
kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition (shaht-foobarcrd):
error: resource foobars.stable.example.com was not successfully created by the Kubernetes API server : customresourcedefinitions.apiextensions.k8s.io "foobars.stable.example.com" already exists
Outputs:
- cluster_name : "shaht-eks-eksCluster-ae19d30"
- cluster_verion : "1.21"
- kubeconfig : [secret]
- managed_nodegroup_capacitytype: "SPOT"
- my_crd : "foobars.stable.example.com"
- my_namespace : "shaht-namespace-ad7d4939"
Resources:
60 unchanged
Duration: 9s
Output of pulumi about
❯ pulumi about
CLI
Version 3.65.1
Go Version go1.20.3
Go Compiler gc
Plugins
  NAME        VERSION
  aws         5.36.1
  aws         5.16.2
  awsx        1.0.2
  docker      3.6.1
  eks         1.0.1
  kubernetes  3.25.0
  nodejs      unknown
Host
OS darwin
Version 11.7.6
Arch x86_64
Found no pending operations associated with dev
Backend
Name pulumi.com
URL https://app.pulumi.com/tushar-..-..
User tushar-..-..
Dependencies:
  NAME                VERSION
  @pulumi/aws         5.36.1
  @pulumi/awsx        1.0.2
  @pulumi/eks         1.0.1
  @pulumi/kubernetes  3.25.0
  @pulumi/pulumi      3.64.0
  @types/node         16.18.24
The issue looks like it's due to our pulumi-kubernetes provider thinking that a new cluster is being targeted, so it attempts to create the resources again. When it does so, it uses an imperative create method, which fails since the cluster already has the resources applied earlier. We should be doing something like "create or update" in cases like these, since the failure on creation shouldn't matter if it's due to the resource already existing.
Note that this issue affects not only CRDs, but any Kubernetes resource.
We can work around this issue by using server-side apply instead; it will not error out, since a PATCH request is sent instead in SSA mode.
Relevant code chunks:
In client side mode, we do a client.Create: https://github.com/pulumi/pulumi-kubernetes/blob/dae363925ac5d7b613afeb2fca1ffb1999c7ee83/provider/pkg/await/await.go#L221
Whereas SSA does a PATCH request: https://github.com/pulumi/pulumi-kubernetes/blob/dae363925ac5d7b613afeb2fca1ffb1999c7ee83/provider/pkg/await/await.go#L206
As a next step toward a fix, we would need to implement logic that determines when a provider is being switched, and issue a patch request instead of a create request.
I updated the issue description with new information about the problem.
After further investigation, there does not appear to be an easy solution to this bug. The same issue affecting SSA-managed resources also impacts a few other resources.
There are a few possible solutions that require further investigation:
- Add support for provider aliasing to tell the engine that two providers target the same thing
- Coordinate between the provider (which can determine whether the targeted cloud/cluster is the same) and the engine, eliminating the need for an explicit alias
- https://github.com/pulumi/pulumi/issues/2625
In the meantime, our recommendation is the following:
- A resource's provider should not be changed as part of an update
- If the provider needs to be changed, the resource should be dropped from Pulumi state, and then reimported under the new provider.
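A sketch of that workflow, using the ConfigMap from the reproduction at the top of this issue; the stack name, URN, and import ID here are illustrative (namespaced Kubernetes resources import as <namespace>/<name>):

# 1. Drop the resource from Pulumi state; the object on the cluster is untouched.
pulumi state delete 'urn:pulumi:dev::my-stack::kubernetes:core/v1:ConfigMap::test'

# 2. Re-import it, re-associating it with the provider currently in the program.
pulumi import kubernetes:core/v1:ConfigMap test default/test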
Related https://github.com/pulumi/pulumi-kubernetes/issues/2948
I've also run into this issue following a provider update, due to certificates in the kubeconfig being renewed. I was able to work around it with some stack surgery (manually editing the stack to contain the new kubeconfig, so the change isn't detected).
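For anyone needing the same workaround: that kind of stack surgery can be done with standard Pulumi commands, editing the exported JSON by hand in between (back up the file first):

pulumi stack export --file stack.json
# ... replace the stale kubeconfig in the provider's inputs/outputs in stack.json ...
pulumi stack import --file stack.json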
Unfortunately, I don't think either of the recommendations is feasible for us:
A resource's provider should not be changed as part of an update
We need to change the kubeconfig, because the old one is no longer valid (expired certificate).
If the provider needs to be changed, the resource should be dropped from Pulumi state, and then reimported under the new provider.
We have several copies of this stack, each with several hundred resources. Most of the resources have names auto-generated by Pulumi, so we'd also have to find the correct name to import for each stack (possible, but not trivial).
I think implementing https://github.com/pulumi/pulumi-kubernetes/issues/2745 could help, since it would provide a way to stop the provider from being replaced.
I'm reassigning to Bryce since he's working on #2745 and that should resolve this one.
We're hitting this as well; thanks for the helpful explanations above.