pulumi-kubernetes
pulumi-kubernetes copied to clipboard
Fix: Missing Kind in preview mode (yaml/v2)
Proposed changes
The yaml/v2 package eagerly resolves kinds to apply a default namespace. This PR addresses a problem with the use case
where a kind is installed by one ConfigGroup and then used by another. In preview mode, the kind cannot be found by the latter group because the expected side-effect (i.e. installation of a CRD) doesn't occur.
The solution to the issue is to maintain a cache in the provider of new CRDs. When the provider executes Check on a CustomResourceDefinition resource, it adds the CRD to the cache. Later, when a component resource must resolve a given kind, it checks the cache. In this way, the cache compensates for the lack of side-effects (i.e. API server calls) that would normally allow the kind to be resolved. Overall ordering is established using DependsOn.
It is reasonable to populate the cache during Check because a)Check produces the "planned" state that is needed later without actuating it, and b) Update is called conditionally and thus isn't a reliable alternative.
Implementation Details
The provider has three overlapping modes of operation - preview-mode, clusterUnreachable and yamlRenderMode - that affect the initialization and use of clientSet. Previously when clusterUnreachable is true then the clientSet is left null. Now, we new-up the clientSet itself but leave its various fields null. This change makes it possible to use the CRD cache in all modes, as was necessary to support yamlRenderMode.
Testing
A new integration test is added called yamlv2 that exercises this scenario.
Manual testing of "cluster unreachable mode" was done by misconfiguring my local environment (to have an invalid kube context). Tested in combination with "yaml rendering mode" and found that the latter works as expected even when the cluster is unreachable.
Related issues (optional)
Does the PR have any schema changes?
Looking good! No breaking changes found. No new resources/functions.
Codecov Report
Attention: Patch coverage is 41.79104% with 39 lines in your changes are missing coverage. Please review.
Project coverage is 27.63%. Comparing base (
a15641e) to head (28c6912).
Additional details and impacted files
@@ Coverage Diff @@
## master #2904 +/- ##
==========================================
+ Coverage 27.55% 27.63% +0.08%
==========================================
Files 53 54 +1
Lines 7818 7862 +44
==========================================
+ Hits 2154 2173 +19
- Misses 5485 5504 +19
- Partials 179 185 +6
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I had a discussion with @Frassle where he expressed support but advised me to submit some tests into pu/pu to harden the contract.
I don't have enough context to feel strongly one way or the other. I do prefer the UX, but my default position is to usually err on the side of simpler-is-better when brushing up against system boundaries like this. If folks from the Platform side are on board with this then it seems like a no brainer.
There are a couple of arguments against this (or adding state anywhere in providers for that matter).
- Is it makes debug attach trickier because you probably now need a clear "Reset" signal to tell the long running attached provider process that it's time to clear its state and start again because a new deployment cycle is happening.
- It's less performant because we can't share the provider process across the preview and up phases of
up(well I suppose we could if we had a supported Reset method like for 1).
But I suspect providers can do much smarter previews if they are stateful, and that's probably worth the downsides of the above.
Just want to mention that the Cancel RPC method could maybe serve as a reset signal. Today, Cancel is not called unless the user cancels the deployment, but it is probably harmless to call at the end of every run, and could reset the provider state for another run. To the point about attaching to a long-running provider, cancellation is a further complication to that story unless the provider takes pains to 'reset' (p-k doesn't; it simply closes a channel).