Resources for migration-job.

Open norbertgrz opened this issue 1 year ago • 10 comments

Please confirm the following

  • [X] I agree to follow this project's code of conduct.
  • [X] I have checked the current issues for duplicates.
  • [X] I understand that AWX Operator is open source software provided for free and that I might not receive a timely response.

Feature Summary

We work in namespaces with resource quotas on our Kubernetes cluster, and we are unable to deploy awx-operator there. The migration job created by the operator specifies no resources, so its pod is never created.

(combined from similar events): Error creating: pods "test-migration-24.3.0-cnkfj" is forbidden: failed quota: default-knlbg: must specify limits.cpu for: migration-job; limits.memory for: migration-job; requests.cpu for: migration-job; requests.memory for: migration-job

Is there a possibility to add resources to the migration job template?
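
For readers unfamiliar with this failure mode, here is a minimal sketch of the kind of ResourceQuota that produces the error above. The object name and every value are assumptions for illustration; the actual quota (default-knlbg) is not shown in this issue. Once a quota tracks a compute resource, Kubernetes rejects any pod in the namespace that does not declare that resource.

```yaml
# Hypothetical quota; name and amounts are illustrative only.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-ns
spec:
  hard:
    # Each key tracked here must be set on every container of every
    # new pod in the namespace, otherwise admission fails with
    # "must specify ...".
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
```

The four keys match the four fields named in the error message for the migration-job container.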

norbertgrz avatar Apr 25 '24 12:04 norbertgrz

Have you tried setting a LimitRange?

apiVersion: v1
kind: LimitRange
metadata:
  name: limit-mem-cpu-per-pod
  namespace: my-ns
spec:
  limits:
  - max:
      cpu: "800m"
      memory: "1Gi"
    min:
      cpu: "100m"
      memory: "99Mi"
    type: Pod

That would ensure that every pod has pre-defined resources without overwriting existing resource specifications.

JMora77 avatar May 06 '24 19:05 JMora77

Yes, this is what has been done. But I'm not a cluster admin. The cluster was prepared by our K8s admins; we have limited access to its configuration and only custom permissions to install the operator.

norbertgrz avatar May 07 '24 06:05 norbertgrz

I see.

If I am not mistaken there is a ResourceQuota on the namespace you are working on. You can contact your k8s admins to enable a LimitRange, and specify resources for all pods that do not already have them.

Adding resources to the migration job is a good solution and I hope it is implemented, but other pods (like the ones running AWX jobs) will hit the same error.

The LimitRange solution has been working in my case.

JMora77 avatar May 07 '24 11:05 JMora77

As I mentioned before, a LimitRange was already set on my cluster. About running jobs, isn't the ee_resource_requirements setting responsible for assigning resources to running execution environments?

norbertgrz avatar May 07 '24 12:05 norbertgrz

I thought so as well, but it did not work for me. However, I found out it can also be defined in the WebUI.
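
The comment does not say which WebUI setting is meant. One plausible candidate, offered here as an assumption, is the custom pod specification of a container group (under Instance Groups in the AWX UI), which controls the pods that run AWX jobs. A rough sketch with resources added to the worker container, using illustrative values:

```yaml
# Sketch of a container group pod spec override; the namespace,
# image tag, and resource amounts are assumptions.
apiVersion: v1
kind: Pod
metadata:
  namespace: awx
spec:
  serviceAccountName: default
  automountServiceAccountToken: false
  containers:
    - name: worker
      image: quay.io/ansible/awx-ee:latest
      args:
        - ansible-runner
        - worker
        - '--private-data-dir=/runner'
      resources:
        # Both requests and limits are set so the pod passes a quota
        # that tracks all four compute keys.
        requests:
          cpu: 250m
          memory: 100Mi
        limits:
          cpu: 500m
          memory: 512Mi
```

This only affects job pods launched through that container group; it does not change the migration job created by the operator.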

JMora77 avatar May 07 '24 13:05 JMora77

Could you guide me to where you found those settings?

norbertgrz avatar May 09 '24 06:05 norbertgrz

@norbertgrz, I ran into the same issue as you and stumbled upon this thread. I think @JMora77 means something along the lines of this which worked for me:

---
apiVersion: v1
kind: LimitRange
metadata:
  name: awx
  namespace: awx
spec:
  limits:
  - type: Container
    default:
      cpu: "300m"
      memory: "400Mi"
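
A variant of the same LimitRange that spells out default requests as well as default limits, since the quota error in this issue asks for both. This is a sketch; the defaultRequest values are assumptions and should be sized for the actual workload.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: awx
  namespace: awx
spec:
  limits:
  - type: Container
    default:          # limits injected when a container sets none
      cpu: "300m"
      memory: "400Mi"
    defaultRequest:   # requests injected when a container sets none
      cpu: "100m"
      memory: "200Mi"
```

Defaults are injected only by entries of type Container. An entry of type Pod with only min and max constrains pods but does not fill in missing values.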

Guent4 avatar May 17 '24 23:05 Guent4

Also not sure if this page is of any use: https://ansible.readthedocs.io/projects/awx-operator/en/latest/user-guide/advanced-configuration/containers-resource-requirements.html
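
For context, that page describes per-container resource fields on the AWX custom resource. A minimal sketch follows; the instance name and all values are assumptions. These fields cover containers in the AWX deployment, and nothing in this thread indicates that they reach the migration job pod.

```yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx            # hypothetical instance name
  namespace: awx
spec:
  web_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
  task_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
  ee_resource_requirements:   # the setting asked about earlier in the thread
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi
```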

Guent4 avatar May 19 '24 23:05 Guent4

This is set already, but it is only a workaround for me. I have no cluster admin permissions, so controlling the LimitRange is out of my scope.

norbertgrz avatar May 28 '24 09:05 norbertgrz

What if there is an OPA Gatekeeper policy enforcing resource requests? The spec for the migration job should be editable.

bewing avatar Jun 14 '24 17:06 bewing

Looks like this issue was solved by this commit https://github.com/ansible/awx-operator/commit/041270ffbe7d27cc5dcc698f3e8d116a3f924b83

norbertgrz avatar Jul 18 '24 13:07 norbertgrz