gatekeeper-library icon indicating copy to clipboard operation
gatekeeper-library copied to clipboard

Fix replicalimits ConstraintTemplate to handle scaling to zero properly. Fixes #419

Open skaven81 opened this issue 2 years ago • 11 comments

What this PR does / why we need it: The replicalimits ConstraintTemplate has two bugs that are fixed in this PR. First, the input_replica_limit() function contained a reference to global input.review.object.spec instead of using the local spec argument. Second, when kubectl scale <resource> --replicas=0 is issued, a Scale resource is created with an empty spec: { }, not spec: { replicas: 0 }. Using object.get() with a default value of 0 ensures that the ConstraintTemplate works properly in this condition.,

Which issue(s) does this PR fix: Fixes #419

skaven81 avatar Oct 27 '23 00:10 skaven81

I'm going to need some help here. I thought that I followed the instructions properly for modifying the source code. But I found that make generate-all doesn't seem to do anything at all with the src_test.rego, so I manually inserted my updated tests into the library tree. I suspect that is what is causing the CI tests to fail. But as I can't seem to get the tests to build at all using make generate-all, how do I properly validate this in my own working copy?

skaven81 avatar Oct 27 '23 18:10 skaven81

I'm going to need some help here. I thought that I followed the instructions properly for modifying the source code. But I found that make generate-all doesn't seem to do anything at all with the src_test.rego, so I manually inserted my updated tests into the library tree. I suspect that is what is causing the CI tests to fail. But as I can't seem to get the tests to build at all using make generate-all, how do I properly validate this in my own working copy?

Hi @skaven81 the separate ./test.sh script should run all tests, however from the CI it initially appears the website documentation updates didn't occur for some reason, which should be included in make generate-all.

apeabody avatar Oct 27 '23 18:10 apeabody

I'm going to need some help here. I thought that I followed the instructions properly for modifying the source code. But I found that make generate-all doesn't seem to do anything at all with the src_test.rego, so I manually inserted my updated tests into the library tree. I suspect that is what is causing the CI tests to fail. But as I can't seem to get the tests to build at all using make generate-all, how do I properly validate this in my own working copy?

Hi @skaven81 the separate ./test.sh script should run all tests, however from the CI it initially appears the website documentation updates didn't occur for some reason, which should be included in make generate-all.

diff --git a/website/docs/validation/replicalimits.md b/website/docs/validation/replicalimits.md
index 4f458dd..343e990 100644
--- a/website/docs/validation/replicalimits.md
+++ b/website/docs/validation/replicalimits.md
@@ -92,6 +92,8 @@ spec:
     kinds:
       - apiGroups: ["apps"]
         kinds: ["Deployment"]
+      - apiGroups: ["autoscaling"]
+        kinds: ["Scale"]
   parameters:
     ranges:
     - min_replicas: 3
@@ -139,6 +141,26 @@ Usage
 kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits/example_allowed.yaml

+ +

+example-scale-allowed + +```yaml +apiVersion: autoscaling/v1 +kind: Scale +metadata:

  • name: allowed-deployment +spec:
  • replicas: 3

+ + +Usage + +shell +kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits/example_scale_allowed.yaml +``` +

example-disallowed @@ -172,6 +194,168 @@ Usage kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits/example_disallowed.yaml ```

+

+
+example-scale-disallowed + +```yaml +apiVersion: autoscaling/v1 +kind: Scale +metadata:
  • name: allowed-deployment +spec:
  • replicas: 100

+ + +Usage + +shell +kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits/example_scale_disallowed.yaml + + +</details> + + +</blockquote></details><details> +<summary>replica-limit-zero</summary><blockquote> + +<details> +<summary>constraint</summary> + +yaml +apiVersion: constraints.gatekeeper.sh/v1beta1 +kind: K8sReplicaLimits +metadata:

  • name: replica-limits +spec:
  • match:
  • kinds:
  •  - apiGroups: ["apps"]
    
  •    kinds: ["Deployment"]
    
  •  - apiGroups: ["autoscaling"]
    
  •    kinds: ["Scale"]
    
  • parameters:
  • ranges:
    • min_replicas: 0
  •  max_replicas: 50
    

+ + +Usage + +shell +kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits_zero/constraint.yaml + + +</details> + +<details> +<summary>example-allowed</summary> + +yaml +apiVersion: apps/v1 +kind: Deployment +metadata:

  • name: allowed-deployment +spec:
  • selector:
  • matchLabels:
  •  app: nginx
    
  • replicas: 0
  • template:
  • metadata:
  •  labels:
    
  •    app: nginx
    
  • spec:
  •  containers:
    
  •  - name: nginx
    
  •    image: nginx:1.14.2
    
  •    ports:
    
  •    - containerPort: 80
    

+ + +Usage + +shell +kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits_zero/example_allowed.yaml + + +</details> +<details> +<summary>example-scale-allowed</summary> + +yaml +apiVersion: autoscaling/v1 +kind: Scale +metadata:

  • name: allowed-deployment +# kubectl scale deploy --replicas=0 creates a Scale +# resource with an empty spec, not replicas:0 +spec: {}

+ + +Usage + +shell +kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits_zero/example_scale_allowed.yaml + + +</details> +<details> +<summary>example-disallowed</summary> + +yaml +apiVersion: apps/v1 +kind: Deployment +metadata:

  • name: disallowed-deployment +spec:
  • selector:
  • matchLabels:
  •  app: nginx
    
  • replicas: 100
  • template:
  • metadata:
  •  labels:
    
  •    app: nginx
    
  • spec:
  •  containers:
    
  •  - name: nginx
    
  •    image: nginx:1.14.2
    
  •    ports:
    
  •    - containerPort: 80
    

+ + +Usage + +shell +kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits_zero/example_disallowed.yaml + + +</details> +<details> +<summary>example-scale-disallowed</summary> + +yaml +apiVersion: autoscaling/v1 +kind: Scale +metadata:

  • name: allowed-deployment +spec:
  • replicas: 100

+ + +Usage + +shell +kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/general/replicalimits/samples/replicalimits_zero/example_scale_disallowed.yaml +``` +

Please run 'make generate generate-website-docs generate-artifacthub-artifacts' to generate the templates and docs

apeabody avatar Oct 27 '23 18:10 apeabody

When I run make generate-artifacthub-artifacts (which is part of make generate-all) it creates a new directory with hundreds of files in it, that weren't in the original repository. It didn't seem right to add that to the commit (I'm pretty sure that's part of what I did wrong in my prior PR).

The website docs did get regenerated, and they're in my commit: https://github.com/open-policy-agent/gatekeeper-library/pull/427/commits/33b39e14c906837b5bd409e63a9cf878890893bc#diff-578c418f3682058d7a372ca0fd2d3dafb3cd57c31df54872bbc81890ad63b261

There's something missing from the contribution process that regular contributors must be doing without thinking about it, but isn't explicit in the documentation.

Here's what I did:

  1. Edit src/general/replicalimits/ files, updating version from 1.0.1 to 1.1.0, applying the constraint template code update, and tweaking the test generation Rego to account for the expected test behavior.
  2. Run make generate-all - this generated updates to library/general/replicalimits/template.yaml and website/docs/validation/replicalimits.md, but did nothing to the tests under library/general/replicalimits/.
  3. Add my own updated tests under library/general/replicalimits/ since the build process didn't update them.
  4. Commit and push, open PR.

Step 3 was almost certainly wrong. But the documentation (https://github.com/open-policy-agent/gatekeeper-library#development) makes no mention of how to build the tests, and the Makefile doesn't seem to have anything in there that touches src_test.rego files. Also the documentation isn't explicitly for "you need to do these things, in this order, to contribute to this project" ... it's just a generalized list of tools available while developing in the repo. The documentation above (https://github.com/open-policy-agent/gatekeeper-library#new-policy) is for adding a new policy ... and that does mention src_test.rego but also makes no mention of what commands need to be run to prepare the repo for a PR.

skaven81 avatar Oct 27 '23 20:10 skaven81

When I run make generate-artifacthub-artifacts (which is part of make generate-all) it creates a new directory with hundreds of files in it, that weren't in the original repository. It didn't seem right to add that to the commit (I'm pretty sure that's part of what I did wrong in my prior PR).

The website docs did get regenerated, and they're in my commit: 33b39e1#diff-578c418f3682058d7a372ca0fd2d3dafb3cd57c31df54872bbc81890ad63b261

There's something missing from the contribution process that regular contributors must be doing without thinking about it, but isn't explicit in the documentation.

Here's what I did:

  1. Edit src/general/replicalimits/ files, updating version from 1.0.1 to 1.1.0, applying the constraint template code update, and tweaking the test generation Rego to account for the expected test behavior.
  2. Run make generate-all - this generated updates to library/general/replicalimits/template.yaml and website/docs/validation/replicalimits.md, but did nothing to the tests under library/general/replicalimits/.
  3. Add my own updated tests under library/general/replicalimits/ since the build process didn't update them.
  4. Commit and push, open PR.

Step 3 was almost certainly wrong. But the documentation (https://github.com/open-policy-agent/gatekeeper-library#development) makes no mention of how to build the tests, and the Makefile doesn't seem to have anything in there that touches src_test.rego files. Also the documentation isn't explicitly for "you need to do these things, in this order, to contribute to this project" ... it's just a generalized list of tools available while developing in the repo. The documentation above (https://github.com/open-policy-agent/gatekeeper-library#new-policy) is for adding a new policy ... and that does mention src_test.rego but also makes no mention of what commands need to be run to prepare the repo for a PR.

Hi @skaven81 - Sorry any confusion, the make generate-all step should be run after all changes (swap steps 2 and 3), as the tests are also used to build the template documentation. Unlike the template generation, the Rego tests under src/general/replicalimits/ do not build the KRM tests under library/general/replicalimits/, unfortunatly they need to be added independently.

I did a quick test run of make generate-all on your branch and saw 14 new files added to artifacthub/library/general/replicalimits/1.1.0 and several updates to website/docs/validation/replicalimits.md - both expected. If you are still seeing something different let me know, all the other CI tests (e.g. those which exercise the tests, etc) currently look good.

apeabody avatar Oct 27 '23 20:10 apeabody

OK I re-ran make generate-all and added the additional changes to the website and artifacthub bits. Let's see if the CI tests pass this time :crossed_fingers:

skaven81 avatar Oct 27 '23 22:10 skaven81

I don't know why we keep dancing back and forth on this. Every time I fix my branch so that the tests all complete successfully, somebody comes along and does a merge from master which reverts some of my changes, and the tests start failing again.

Somebody please just spend 10 minutes to manually review the actual PR and merge it in, instead of depending on the automation alone.

skaven81 avatar Jan 03 '24 16:01 skaven81

@skaven81 Thank you for the PR and apologies for the back and forth. The merge from main didnt show any conflicts. If you see any, can you please help update the conflicts? We will review this today.

Error from CI:

looks like template.yaml is updated but the version is not. Please update the 'metadata.gatekeeper.sh/version' annotation in the template.yaml source

Looks like a previously merged PR updated the template version https://github.com/open-policy-agent/gatekeeper-library/pull/429/files#diff-c3f23f7cfeabf9bf68a4be9f0d84eb695dcaf8b31f3cdcbdaaa6aa269fcbb823L7. Can you please bump it again? Thank you!

ritazh avatar Jan 03 '24 16:01 ritazh

@apeabody other than the template version, LGTY?

ritazh avatar Jan 03 '24 18:01 ritazh

@apeabody other than the template version, LGTY?

Thanks @ritazh, agreed!

Subsequent to bumping the template version will need a make generate-all. I also believe the stale changes under artifacthub/library/general/replicalimits/1.0.2/ will need to be manually reverted.

apeabody avatar Jan 03 '24 19:01 apeabody

This issue/PR has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 03 '24 19:03 stale[bot]