provider-ansible
AnsibleRun Support in Composition/XR
What happened?
I can create an AnsibleRun resource without issue and run an inline ansible playbook. However, I'm unable to add the same AnsibleRun resource as a part of a larger crossplane composition/XR. Should it be possible to use AnsibleRun within an XR?
How can we reproduce it?
Creating an AnsibleRun resource like the following works without issue:
apiVersion: ansible.crossplane.io/v1alpha1
kind: AnsibleRun
metadata:
  name: ansible-example
spec:
  forProvider:
    playbookInline: |
      ---
      - hosts: localhost
        tasks:
          - name: ansibleplaybook-example
            debug:
              msg: Your are running 'ansibleplaybook-example' example
  providerConfigRef:
    name: provider-ansible
When adding the same resource to an XR like below, the other resources (EC2 and SecurityGroup) in the composition are created, but the ansiblerun resource is not created:
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: admin-server
  labels:
    crossplane.io/xrd: xadmininstances.aws.hades.org
    provider: provider-aws
spec:
  writeConnectionSecretsToNamespace: crossplane-system
  compositeTypeRef:
    apiVersion: hades.org/v1alpha1
    kind: XAdminInstance
  resources:
    - name: securitygroup
      base:
        apiVersion: ec2.aws.crossplane.io/v1beta1
        kind: SecurityGroup
        spec:
          forProvider:
            region: us-east-1
            vpcId: vpc-0186b862b83f5cd71
            description: Admin server for Environment
            ingress:
              - fromPort: 0
                toPort: 65535
                ipProtocol: tcp
                ipRanges:
                  - cidrIp: 10.77.77.20/32
              - fromPort: 22
                toPort: 22
                ipProtocol: tcp
                ipRanges:
                  - cidrIp: 10.77.77.10/32
              - fromPort: 22
                toPort: 22
                ipProtocol: tcp
                ipRanges:
                  - cidrIp: 166.28.0.0/16
              - fromPort: 443
                toPort: 443
                ipProtocol: tcp
                ipRanges:
                  - cidrIp: 0.0.0.0/32
              - fromPort: 8080
                toPort: 8084
                ipProtocol: tcp
                ipRanges:
                  - cidrIp: 0.0.0.0/32
              - fromPort: 80
                toPort: 80
                ipProtocol: tcp
                ipRanges:
                  - cidrIp: 0.0.0.0/0
          providerConfigRef:
            name: provider-aws
      patches:
        - type: FromCompositeFieldPath
          fromFieldPath: "metadata.name"
          toFieldPath: "spec.forProvider.groupName"
    - name: admin-instance
      base:
        apiVersion: ec2.aws.crossplane.io/v1alpha1
        kind: Instance
        spec:
          forProvider:
            region: us-east-1
            imageId: ami-02ae903c0b1d9fd12
            instanceType: t3.medium
            keyName: hades-key
            blockDeviceMappings:
              - deviceName: /dev/sdx
                ebs:
                  volumeType: gp3
            subnetId: subnet-08d3a539398176845
            securityGroupSelector:
              matchControllerRef: true
            tags:
              - key: Name
                value: somogyi-admin
          providerConfigRef:
            name: provider-aws
      patches:
        - type: FromCompositeFieldPath
          fromFieldPath: "spec.parameters.storageGB"
          toFieldPath: "spec.forProvider.blockDeviceMappings[0].ebs.volumeSize"
    - name: ansibleconfig
      base:
        apiVersion: ansible.crossplane.io/v1alpha1
        kind: AnsibleRun
        spec:
          forProvider:
            playbookInline: |
              ---
              - hosts: localhost
                tasks:
                  - name: ansibleplaybook-example
                    debug:
                      msg: Hello world!
          providerConfigRef:
            name: provider-ansible
What environment did it happen in?
Crossplane version: 1.10.1
provider-ansible:v0.4.0
provider-aws:v0.33.0
- Cloud provider or hardware configuration: AWS
- Kubernetes version (use kubectl version): 1.21.1
- Kubernetes distribution (e.g. Tectonic, GKE, OpenShift): RKE2
- OS (e.g. from /etc/os-release): RHEL 8.7
- Kernel (e.g. uname -a): Linux X86_64
FWIW, I'm also using AnsibleRun in a Composition, and after some troubleshooting (just started learning / using Crossplane this week), I was able to get AnsibleRun to work as long as I don't specify a name for this resource or any others in the Composition and patch in the Claim namespace to AnsibleRun:
<snip>
- base:
    apiVersion: ansible.crossplane.io/v1alpha1
    kind: AnsibleRun
    spec:
      forProvider:
        vars:
          ansible_ssh_user: root
          ansible_ssh_private_key_file: ./ssh_id
          ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
          rke2_download_kubeconf: True
          rke2_download_kubeconf_path: /tmp/
          rke2_cni: cilium
          rke2_token: <redacted>
        roles:
          - name: lablabs.rke2
            src: lablabs.rke2
      providerConfigRef:
        name: default
      writeConnectionSecretToRef:
        name: rke2-install
        namespace: upbound-system
  patches:
    - fromFieldPath: spec.claimRef.namespace
      toFieldPath: metadata.namespace
    - fromFieldPath: spec.claimRef.name
      toFieldPath: metadata.name
    - type: CombineFromComposite
      combine:
        variables:
          - fromFieldPath: status.master0IPAddress
          - fromFieldPath: status.master1IPAddress
          - fromFieldPath: status.master2IPAddress
        strategy: string
        string:
          fmt: '{"all":{"children":{"k8s_cluster":{"children":{"masters":{"hosts":{"%s":null,"%s":null,"%s":null}}}}}}}'
      toFieldPath: spec.forProvider.inventoryInline
I'm using the image built from the main branch.
Thanks @AshleyDumaine. That got it working for me!! Really appreciate the response.
I was able to get AnsibleRun to work as long as I don't specify a name for this resource or any others in the Composition and patch in the Claim namespace to AnsibleRun
@ride808 @fahedouch Do you agree that AnsibleRun should be usable within Compositions without the restrictions described by @AshleyDumaine above?
If so, should this issue be formally re-opened to request removal of the restrictions described above? Or, should this issue remain closed and the following issues be opened instead?
- Allow AnsibleRun to be used in Compositions that specify names for resources.
- Allow AnsibleRun to be used in Compositions without specifying a namespace.
@ron1 would you please create an issue ticket with this information and then re-close this one? Thanks.
@AshleyDumaine @fahedouch One last question. In Ashley's snippet above, the var ansible_ssh_private_key_file: ./ssh_id is set in the forProvider section of the AnsibleRun resource. I assume this tells the provider to use that private key when executing the role/playbook, etc. in the provider pod. However, how do you get a private key of your choosing into the provider pod? I'm getting SSH unreachable errors, and I'm guessing that it's because there is no SSH key in the provider container:
[ec2-user@ip-166-28-20-77 ~]$ kubectl exec -it provider-ansible-325ec633e2d4-67db7b66fc-5tr66 -n crossplane-system sh
$ ps auxx
PID USER TIME COMMAND
1 ansible 21:50 crossplane-ansible-provider
593 ansible 1:02 [ansible-playboo]
1283 ansible 0:15 {ansible-playboo} /usr/bin/python3 /usr/bin/ansible-playbook -e @/ansibleDir/19ecb472-e7bf-417f-80e0-9a2d6632da0e/env/extravars playbook.yml
1344 ansible 0:00 {ansible-playboo} /usr/bin/python3 /usr/bin/ansible-playbook -e @/ansibleDir/19ecb472-e7bf-417f-80e0-9a2d6632da0e/env/extravars playbook.yml
1492 ansible 0:00 sh
1498 ansible 0:00 ps auxx
$ cat ansibleDir/19ecb472-e7bf-417f-80e0-9a2d6632da0e/env/extravars
{"ansible_provider_meta":{"somogyi-admin":{"state":"present"}},"ansible_ssh_common_args":"-o StrictHostKeyChecking=no","ansible_ssh_private_key_file":"./ssh_id","ansible_ssh_user":"root"}/ $
It seems I'm definitely misunderstanding something here and how a provider is configured to allow connections to your provisioned managed resources. Where should ./ssh_id be coming from? Please let me know if you'd like me to open this as a question/new issue.
@ride808 See this comment which might help.
@ride808 I was using a ProviderConfig with a Secret holding the private key. This ProviderConfig is referenced in my snippet's providerConfigRef.
Example (that puts the private key at ./ssh_id):
apiVersion: ansible.crossplane.io/v1alpha1
kind: ProviderConfig
metadata:
  name: default
  namespace: upbound-system
spec:
  credentials:
    - filename: ssh_id
      source: Secret
      secretRef:
        key: <key-name>
        name: <secret-name>
        namespace: upbound-system
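For completeness, a minimal sketch of a Secret that such a ProviderConfig could reference. The Secret name `ansible-ssh-key` and the key `ssh-privatekey` are hypothetical placeholders standing in for `<secret-name>` and `<key-name>` above; they are not taken from the thread.

```yaml
# Sketch only: names below are illustrative assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: ansible-ssh-key          # hypothetical; use as <secret-name>
  namespace: upbound-system      # must match secretRef.namespace
type: Opaque
stringData:
  ssh-privatekey: |              # hypothetical; use as <key-name>
    -----BEGIN OPENSSH PRIVATE KEY-----
    <redacted>
    -----END OPENSSH PRIVATE KEY-----
```

With `filename: ssh_id` in the ProviderConfig, the key's contents should then be available to the playbook as ./ssh_id, which is the path that `ansible_ssh_private_key_file` points to in the earlier snippet.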
We also support inventory, if that helps.
@AshleyDumaine @fahedouch How do you manage dependencies? In my composition, my ansiblerun resource executes against the EC2 instance I provisioned in the same composition, using an inline inventory. But ansiblerun is created in parallel with the instance resource, so I get Unreachable errors that kill my playbook. After the instance reaches a running state, I can delete the ansiblerun resource in my cluster and let it be re-created, and then the playbook executes and completes. I've been unsuccessful at using wait_for_connection in the playbook, as it just hangs and never exits. Did you encounter this? Any tips? Also, the ansiblerun resource didn't seem to try again after the failed playbook (should it?). The only way I could get the playbook to rerun was by deleting the ansiblerun resource and letting Crossplane re-create it.
@ride808 I believe this issue is relevant: https://github.com/crossplane/crossplane/issues/2072
Specifically for AnsibleRun operating against newly provisioned Crossplane-managed VMs in the Composition, I've had to add the following to get it to work more reliably:
ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o ConnectionAttempts=10 -o ConnectTimeout=60'
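For context, a minimal sketch of where that variable would sit in the AnsibleRun spec, assuming the same forProvider.vars layout as the earlier snippet in this thread (the other vars are copied from that snippet, not new requirements):

```yaml
# Sketch only: fragment of an AnsibleRun spec, reusing the vars layout shown earlier.
spec:
  forProvider:
    vars:
      ansible_ssh_user: root
      ansible_ssh_private_key_file: ./ssh_id
      # Ask ssh to retry and wait longer while the new VM is still booting.
      ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o ConnectionAttempts=10 -o ConnectTimeout=60'
```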
Hmm. That didn't seem to do the trick either. I can see the options getting passed in to the ssh call:
712 ansible 0:00 {ansible-runner} /usr/bin/python3.8 /usr/bin/ansible-runner run ansibleDir/71cdbead-151d-458f-b5be-c4d8c0562d6c -p playbook.yml
714 ansible 0:05 {ansible-playboo} /usr/bin/python3 /usr/bin/ansible-playbook -e @/ansibleDir/71cdbead-151d-458f-b5be-c4d8c0562d6c/env/extravars playbook.yml
723 ansible 0:00 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o IdentityFile="./sshkey" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User="ec2-user" -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ConnectionAttempts=10 -o ConnectTimeout=60 -o ControlPath=/home/ansible/.ansible/cp/144eb13078 166.28.17.129 /bin/sh -c 'echo ~ec2-user && sleep 0'
But after 30 seconds the ansible-playbook and ansible-runner processes in the provider pod just stop and my playbook never finishes. Seems to be the provider prematurely killing the playbook.
Could it be that this isn't yet in a release and is killing my ansible playbook too quickly? https://github.com/crossplane-contrib/provider-ansible/pull/177/
@ride808 fixed by https://github.com/crossplane-contrib/provider-ansible/pull/177, would you please retry with the main branch docker image here
@fahedouch It seems a new 0.4.1 release that contains all the fixes/improvements sitting on the main branch would be very welcome here.
I am planning to release v0.4.1 by the end of the week.
@ron1 would you please create an issue ticket with this information and then re-close this one? Thanks.
Unfortunately I don't have an environment at the moment in which to reproduce these issues. @ride808 would you consider creating two new replacement issues and closing this one?
@fahedouch @ron1 the main image containing #177 did fix my issue. Is there a way to set that timeout? I didn't see any docs with the pull request on how to configure the provider and can see my playbooks taking longer than the default 20m.
@ron1 I'll try to create those two issues against the project today and will close this one when I do.
@ride808
To override the default timeout or other flags (e.g. poll, ansible-collections-path, etc.), you have to set up a ControllerConfig resource with the new timeout value (e.g. args: ["--timeout","50m"]). Then reference your ControllerConfig resource in your Provider resource using controllerConfigRef.
I'm not sure whether this controllerConfigRef is picked up dynamically for an existing Provider resource. You may have to redeploy the provider for it to take effect.
Maybe we should add a FAQ to address these kinds of questions!
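A minimal sketch of what that could look like, assuming the standard Crossplane `ControllerConfig` (pkg.crossplane.io/v1alpha1) and `Provider` (pkg.crossplane.io/v1) kinds. The name `provider-ansible-config` is hypothetical, and the package reference is left as a placeholder for whatever image you already install; only the `--timeout` flag and the `50m` value come from the comment above.

```yaml
# Sketch only: resource names are illustrative assumptions.
apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: provider-ansible-config      # hypothetical name
spec:
  args:
    - --timeout
    - 50m
---
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-ansible
spec:
  package: <provider-ansible-package>   # placeholder: the package you already use
  controllerConfigRef:
    name: provider-ansible-config
```

Note that ControllerConfig has been deprecated in newer Crossplane releases in favor of DeploymentRuntimeConfig, so this shape applies to the Crossplane versions discussed in this thread.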
@ron1 I'll try to create those two issues against the project today and will close this one when I do.
thanks
Any more info on the namespace requirement? Same as @ride808 and @AshleyDumaine, I'm unable to compose an AnsibleRun resource unless I patch in a namespace. Is this intentional? I don't recall having to explicitly set the namespace for other providers (like AWS and Terraform).
I can confirm @AshleyDumaine's observations that an AnsibleRun in a composition only works when:
- the base name is not specified
- metadata.name is specified
- metadata.namespace is specified
I really wonder how you figured the first one out. This is such a weird issue.
If you don't follow the workaround, the error message is:
cannot generate a name for composed resource "ansible-run": an empty namespace may not be set when a resource name is provided
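The three conditions above can be sketched as a single resources entry. This is only an illustration assembled from the snippets earlier in this thread (the inline playbook and the `default` ProviderConfig name are carried over from them), not a confirmed or recommended pattern:

```yaml
# Sketch only: legacy patch-and-transform Composition fragment.
resources:
  # Condition 1: no `name:` on this entry.
  - base:
      apiVersion: ansible.crossplane.io/v1alpha1
      kind: AnsibleRun
      spec:
        forProvider:
          playbookInline: |
            ---
            - hosts: localhost
              tasks:
                - name: example
                  debug:
                    msg: Hello world!
        providerConfigRef:
          name: default
    patches:
      # Conditions 2 and 3: set metadata.name and metadata.namespace from the claim.
      - fromFieldPath: spec.claimRef.name
        toFieldPath: metadata.name
      - fromFieldPath: spec.claimRef.namespace
        toFieldPath: metadata.namespace
```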
These workarounds only work in legacy patch-and-transform (P&T) compositions. With composition functions, the result is:
cannot compose resources: cannot update composite resource spec.resourceRefs: failed to create typed patch object (/example-run-rxm4q; crossplane.accenture.com/v1alpha1, Kind=XAnsible): .spec.resourceRefs[0].namespace: field not declared in schema''
Still facing this issue when using cluster-scoped AnsibleRun within a pipeline composition after upgrading to newly released v0.6.0.
---
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: ansibletests.custom-api.example.org
spec:
  group: custom-api.example.org
  names:
    kind: AnsibleTest
    plural: ansibletests
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema: {}
---
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: ansibletest
spec:
  compositeTypeRef:
    apiVersion: custom-api.example.org/v1alpha1
    kind: AnsibleTest
  mode: Pipeline
  pipeline:
    - step: run-ansible
      functionRef:
        name: function-go-templating
      input:
        apiVersion: gotemplating.fn.crossplane.io/v1beta1
        kind: GoTemplate
        source: Inline
        inline:
          template: |
            apiVersion: ansible.crossplane.io/v1alpha1
            kind: AnsibleRun
            metadata:
              annotations:
                {{ setResourceNameAnnotation "run-ansible" }}
            spec:
              forProvider:
                playbookInline: |
                  ---
                  - hosts: localhost
                    tasks:
                      - name: ansibletest
                        debug:
                          msg: This Is A Test
    - step: automatically-detect-ready-composed-resources
      functionRef:
        name: function-auto-ready
---
apiVersion: custom-api.example.org/v1alpha1
kind: AnsibleTest
metadata:
  name: my-test
spec: {}
Errors:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SelectComposition 108s defined/compositeresourcedefinition.apiextensions.crossplane.io Successfully selected composition: ansibletest
Normal SelectComposition 108s defined/compositeresourcedefinition.apiextensions.crossplane.io Selected composition revision: ansibletest-9289b63
Warning ComposeResources 46s (x7 over 108s) defined/compositeresourcedefinition.apiextensions.crossplane.io cannot compose resources: cannot generate a name for composed resource "run-ansible": an empty namespace may not be set when a resource name is provided
Is this expected? Is there any known workaround for this?
Thanks
Is the CRD cluster- or namespace-scoped in your Kubernetes cluster?
Hi @janwillies, I'm using the new cluster-scoped CRD that was introduced in v0.6.0.
NAME          SHORTNAMES   APIVERSION                       NAMESPACED   KIND
ansibleruns                ansible.crossplane.io/v1alpha1   false        AnsibleRun