
control_plane_ee_image cannot override the default

Open jpic opened this issue 3 years ago • 11 comments

ISSUE TYPE
  • Bug Report
SUMMARY

_custom_control_plane_ee_image is set after execution_environments.py is rendered.

ENVIRONMENT
  • AWX version: latest
  • Operator version: latest
  • Kubernetes version: latest
  • AWX install method: k8s
STEPS TO REPRODUCE

Set control_plane_ee_image to a non-default value.

EXPECTED RESULTS

That AWX "Control Plane Execution Environment" uses control_plane_ee_image.

ACTUAL RESULTS

"Control Plane Execution Environment" has quay.io/ansible/awx-ee:latest anyway.

ADDITIONAL INFORMATION

This is because AWX uses the value from execution_environments.py.j2, which is rendered by app_credentials.yaml.j2, which is meant to use _custom_control_plane_ee_image.

But _custom_control_plane_ee_image is only set after the templates in question are rendered.
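To illustrate the ordering problem, here is a minimal Python sketch. It is not the operator's code: the task names `render_settings` and `set_custom_image`, the `run` helper, and the example image `registry.example.com/my-ee:1.0` are all hypothetical. It only shows that a template rendered before the custom variable exists falls back to the default image.

```python
# Hypothetical simulation of task ordering; the names mirror the issue but
# this is not the operator's actual code.
from string import Template

DEFAULT_EE = "quay.io/ansible/awx-ee:latest"

# Stand-in for the settings template.
SETTINGS_TEMPLATE = Template("CONTROL_PLANE_EXECUTION_ENVIRONMENT = '$image'")


def render(facts):
    """Render the template with whatever facts exist at call time."""
    image = facts.get("_custom_control_plane_ee_image", DEFAULT_EE)
    return SETTINGS_TEMPLATE.substitute(image=image)


def run(tasks, spec):
    """Run tasks in order; a task may read or update the shared facts."""
    facts, rendered = {}, None
    for task in tasks:
        if task == "render_settings":
            rendered = render(facts)
        elif task == "set_custom_image" and "control_plane_ee_image" in spec:
            facts["_custom_control_plane_ee_image"] = spec["control_plane_ee_image"]
    return rendered


spec = {"control_plane_ee_image": "registry.example.com/my-ee:1.0"}

# Rendering first ignores the custom image; setting the fact first honours it.
buggy = run(["render_settings", "set_custom_image"], spec)
fixed = run(["set_custom_image", "render_settings"], spec)
print(buggy)
print(fixed)
```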

WORK AROUND

One might try setting CONTROL_PLANE_EXECUTION_ENVIRONMENT in custom.py, and indeed, /etc/tower/conf.d/custom.py will then contain CONTROL_PLANE_EXECUTION_ENVIRONMENT=your/value.

But /etc/tower/conf.d/execution_environments.py will still contain CONTROL_PLANE_EXECUTION_ENVIRONMENT=quay.io/ansible/awx-ee:latest.

The latter shadows the former because the Tower conf.d/*.py files are read one by one via a glob in the Tower settings, and execution_environments.py is read after custom.py.

Solution
  • Add CONTROL_PLANE_EXECUTION_ENVIRONMENT=your/value to your custom.py, and also
  • Change mountPath: /etc/tower/conf.d/custom.py to mountPath: /etc/tower/conf.d/z_custom.py.

Or use extra_settings, which should hardcode them at the end of the production settings, but I have not tested that one.

Don't forget to also set control_plane_ee_image just in case ;)
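The shadowing and the rename described above can be sketched in Python. This assumes the conf.d files are executed in sorted filename order into a single namespace, which is what this issue implies but which I have not checked against the AWX source. `load_conf_d` and `effective_image` are hypothetical helpers working in a temporary directory, not AWX functions.

```python
import glob
import os
import tempfile


def load_conf_d(conf_dir):
    """Execute every *.py file in sorted filename order into one namespace."""
    settings = {}
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.py"))):
        with open(path) as handle:
            # Later files overwrite names defined by earlier ones.
            exec(handle.read(), settings)
    return settings


def effective_image(custom_filename):
    """Return the image that wins when the custom file has the given name."""
    with tempfile.TemporaryDirectory() as conf_dir:
        files = {
            "execution_environments.py":
                "CONTROL_PLANE_EXECUTION_ENVIRONMENT = 'quay.io/ansible/awx-ee:latest'\n",
            custom_filename:
                "CONTROL_PLANE_EXECUTION_ENVIRONMENT = 'your/value'\n",
        }
        for name, body in files.items():
            with open(os.path.join(conf_dir, name), "w") as handle:
                handle.write(body)
        return load_conf_d(conf_dir)["CONTROL_PLANE_EXECUTION_ENVIRONMENT"]


# "custom.py" sorts before "execution_environments.py", so it is shadowed;
# "z_custom.py" sorts after it, so the custom value has the last word.
print(effective_image("custom.py"))
print(effective_image("z_custom.py"))
```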

jpic avatar Dec 07 '21 15:12 jpic

Something still writes the quay.io image on top of my own default control plane EE image at some point after a few redeploys. I won't have time to debug it though, as the user decided to just specify their own control plane image all the time.

It's quite problematic because the default image does not include the community.general collection, which contains many modules that were built into previous Ansible releases (https://github.com/ansible/awx-ee/issues/65). This means the default awx-ee image is not expected to work for most use cases (playbooks developed on previous Ansible releases).

jpic avatar Dec 08 '21 10:12 jpic

Very much the same issue here. I fixed it by attaching execution_environments.py.j2 from operator version 0.14.0 to an operator instance:

GLOBAL_JOB_EXECUTION_ENVIRONMENTS = [
{% for item in ee_images %}
    {'name': '{{ item.name }}' , 'image': '{{ item.image }}'},
{% endfor %}
]
CONTROL_PLANE_EXECUTION_ENVIRONMENT = '{{ control_plane_ee_image }}'

Definitely an issue with ordering: as you mentioned, the secret is created before the custom control plane variable is processed.

This might be the commit that introduced the problem.

marekzebro avatar Dec 10 '21 11:12 marekzebro

@marekzebro is it sticking for you after a bunch of re-applies on the AWX object?

jpic avatar Dec 10 '21 15:12 jpic

@jpic, it is staying there. As far as I know, the configuration is read from the -app-credentials secret object, and execution_environments.py.j2 injects a key into this secret. Unfortunately I have an additional problem: the control plane execution environment does not execute project updates on AWX 19.5.0, although it works completely fine with the previous version.
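For anyone who wants to check what actually ended up in that secret, here is a minimal Python sketch. It assumes the secret's `data` field (for example from `kubectl get secret <name>-app-credentials -o json`) holds a base64-encoded entry named `execution_environments.py`; that key name is my assumption and may differ between operator versions. The sample data below is made up for illustration, and `control_plane_image` is a hypothetical helper.

```python
import base64
import re


def control_plane_image(secret_data, key="execution_environments.py"):
    """Decode one base64 secret entry and extract the control plane image."""
    text = base64.b64decode(secret_data[key]).decode("utf-8")
    match = re.search(
        r"^CONTROL_PLANE_EXECUTION_ENVIRONMENT\s*=\s*['\"]([^'\"]+)['\"]",
        text,
        flags=re.MULTILINE,
    )
    return match.group(1) if match else None


# Made-up sample standing in for the `data` field of the secret.
sample_settings = (
    "GLOBAL_JOB_EXECUTION_ENVIRONMENTS = []\n"
    "CONTROL_PLANE_EXECUTION_ENVIRONMENT = 'registry.example.com/my-ee:1.0'\n"
)
sample_secret = {
    "execution_environments.py": base64.b64encode(sample_settings.encode()).decode()
}

print(control_plane_image(sample_secret))
```

If the value printed for your real secret is still the quay.io default, the template was rendered without the custom image.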

marekzebro avatar Dec 10 '21 15:12 marekzebro

Hello,

Same issue here. Any news?

Regards,

JSGUYOT avatar Feb 11 '22 08:02 JSGUYOT

If you are operating in a closed environment that does not have access to the internet, this issue makes it impossible to upgrade beyond 19.4.0.

Until this problem is resolved, any upgrade is impossible for us. 19.5.1 exhibits the behaviour described in this issue. If you are offline, you don't have access to the default image locations on the internet, so you have to override them. This override is broken after 19.4.0.

jeremyd100 avatar Jun 27 '22 11:06 jeremyd100

Almost two years on. No solution?
We are hitting this while testing our deployment for offline/isolated environments. We have control_plane_ee_image set correctly in the AWX deployment spec. The decoded -app-credentials secret shows the correct value for control_plane_ee_image, matching our AWX deployment spec.

Any ideas where to look next?

lbrigman124 avatar Mar 05 '24 21:03 lbrigman124

Didn't that work for you? https://github.com/ansible/awx-operator/issues/685#issuecomment-990883346 I don't remember the case but reading the issue again makes me believe that a simple fix could be contributed: wherever that glob was running, enforce a load of mountPath after it, to ensure custom values always have the last word. Feel free to mail me if you need any help with that, I would contribute it but don't have access to an environment anymore.

jpic avatar Mar 06 '24 17:03 jpic

We got it figured out. Here are the steps for others to follow without needing to dig into all the details; YMMV depending on versions.

Create a config map from the execution_environments.py.j2 file above:

kubectl -n <namespace> create configmap cp-ee-config --from-file=execution_environments.py.j2

Modify the AWX controller deployment to use the above config map. Subsection of the deployment (volumeMounts section):

          volumeMounts:
            - name: "cp-ee-config"
              mountPath: "/opt/ansible/roles/installer/templates/settings/execution_environments.py.j2"
              subPath: "execution_environments.py.j2"

Volume section

      volumes:
        - name: "cp-ee-config"
          configMap:
            name: "cp-ee-config"

The above changes worked for us, based on the comment above: https://github.com/ansible/awx-operator/issues/685#issuecomment-990883346. Versions: AWX v23.1.0, AWX Controller v2.5.3.

lbrigman124 avatar Mar 06 '24 23:03 lbrigman124