[opentelemetry][callback] Add support for http exporter
SUMMARY
The previous version of the callback supported only the gRPC exporter. This was counterintuitive because the documentation mentions <your endpoint (OTLP/HTTP)>. Users were left with an error similar to:
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to <endpoint>, retrying in 1s.
This commit fixes the situation by supporting both HTTP and gRPC, selectable via the standard environment variables and ansible.cfg.
Fixes #7888
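As an illustration of the behavior described in the summary, here is a minimal sketch of how an exporter could be chosen from the standard environment variables. It is not the code from this patch: the helper names `resolve_traces_protocol` and `build_span_exporter` are hypothetical, the fallback to `OTEL_EXPORTER_OTLP_PROTOCOL` and the `grpc` default (chosen to match the previous gRPC-only behavior) are assumptions, and the ansible.cfg side is left out.

```python
import os

# Protocol values handled by this sketch.
SUPPORTED_PROTOCOLS = ("grpc", "http/protobuf")


def resolve_traces_protocol(environ=None, default="grpc"):
    """Return the OTLP traces protocol requested through the environment.

    Assumption: the traces-specific variable takes precedence over the
    generic one, and "grpc" is used when neither is set.
    """
    environ = os.environ if environ is None else environ
    for name in ("OTEL_EXPORTER_OTLP_TRACES_PROTOCOL", "OTEL_EXPORTER_OTLP_PROTOCOL"):
        value = environ.get(name, "").strip().lower()
        if value:
            if value not in SUPPORTED_PROTOCOLS:
                raise ValueError("unsupported OTLP protocol %r in %s" % (value, name))
            return value
    return default


def build_span_exporter(protocol):
    """Instantiate the span exporter matching the protocol.

    The imports are deferred so that only the selected exporter package
    has to be importable. Both exporters read the endpoint from the
    OTEL_EXPORTER_OTLP_* environment variables when called without arguments.
    """
    if protocol == "http/protobuf":
        from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    else:
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    return OTLPSpanExporter()
```

With `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf` exported, `build_span_exporter(resolve_traces_protocol())` would return the HTTP exporter; with nothing set it would fall back to gRPC.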
ISSUE TYPE
- Bugfix Pull Request
COMPONENT NAME
community.general.opentelemetry
ADDITIONAL INFORMATION
To reproduce the issue and validate the correct behavior, consider the following directory setup:
$ tree
.
├── ansible.cfg
├── ansible-execution_environment
│   └── Dockerfile
├── docker-compose.yaml
├── opentelemetry.py
└── test.yaml
2 directories, 5 files
Set ansible.cfg
$ cat ansible.cfg
[defaults]
callbacks_enabled = community.general.opentelemetry
[callback_opentelemetry]
enable_from_environment = ANSIBLE_OPENTELEMETRY_ENABLED
Set a sample playbook:
$ cat test.yaml
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Print the gateway for each host when defined
      ansible.builtin.debug:
        msg: Test
The execution environment is the following. Note that any environment with Ansible and the correct Python dependencies should work as well:
$ cat ansible-execution_environment/Dockerfile
FROM quay.io/ansible/awx-ee:23.9.0
USER root
RUN pip install --no-cache-dir \
    opentelemetry-api==1.24.0 \
    opentelemetry-exporter-otlp==1.24.0 \
    opentelemetry-sdk==1.24.0
USER 1000
Here is the docker-compose file
$ cat docker-compose.yaml
services:
  jaeger:
    image: jaegertracing/all-in-one:1.56
    container_name: jaeger
    ports:
      - 45400:16686
    environment:
      - COLLECTOR_OTLP_ENABLED=true
  ansible:
    image: reproducer
    build: ./ansible-execution_environment
    container_name: ansible
    volumes:
      - ./:/mnt
      # - ./opentelemetry.py:/usr/share/ansible/collections/ansible_collections/community/general/plugins/callback/opentelemetry.py # Mount the patched file in the correct directory
    environment:
      - ANSIBLE_OPENTELEMETRY_ENABLED=true
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318
      - OTEL_SERVICE_NAME=testing-ansible-tracing
    entrypoint: sleep 3000
Build the image
docker-compose build
Start the stack
docker-compose up
Now enter the execution environment
docker exec -it -u root -w /mnt ansible /bin/bash
Before the patch, Ansible reports the export error:
# ansible-playbook test.yaml
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match
'all'
PLAY [localhost] *****************************************************************************************************
TASK [Print the gateway for each host when defined] ******************************************************************
ok: [localhost] => {
"msg": "Test"
}
PLAY RECAP ***********************************************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 8s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 16s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 32s.
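The retries above are consistent with a gRPC exporter being pointed at the OTLP/HTTP port. As a hedged diagnostic sketch (not part of the patch), the helper below compares the endpoint port with the conventional OTLP defaults, 4317 for gRPC and 4318 for HTTP. The function name `protocol_mismatch_hint` is hypothetical, and a collector may of course listen on non-default ports, in which case the helper stays silent.

```python
from urllib.parse import urlsplit

# Conventional OTLP receiver ports; deployments may use different ones.
DEFAULT_OTLP_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def protocol_mismatch_hint(endpoint, protocol):
    """Return a hint when the endpoint port is the default of another protocol.

    Returns None when no mismatch can be inferred from the port alone.
    """
    # Accept bare "host:port" values by giving urlsplit a network location.
    if "://" not in endpoint:
        endpoint = "//" + endpoint
    port = urlsplit(endpoint).port
    for other, other_port in DEFAULT_OTLP_PORTS.items():
        if other != protocol and port == other_port:
            return (
                "endpoint port %d is the default for %s, but the exporter uses %s"
                % (port, other, protocol)
            )
    return None


if __name__ == "__main__":
    # Values taken from the reproducer: HTTP port, gRPC exporter.
    print(protocol_mismatch_hint("http://jaeger:4318", "grpc"))
```

Applied to the reproducer, the endpoint `http://jaeger:4318` combined with the gRPC exporter yields a hint, while the same endpoint with `http/protobuf` does not.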
After the file is patched and the correct variable is exported, there is no error and the trace is visible in the backend:
# export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf
# ansible-playbook test.yaml
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match
'all'
PLAY [localhost] *****************************************************************************************************
TASK [Print the gateway for each host when defined] ******************************************************************
ok: [localhost] => {
"msg": "Test"
}
PLAY RECAP ***********************************************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
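To confirm from the host that the trace reached the backend without opening the UI, something like the following could be used. This is a sketch under assumptions: it relies on the Jaeger query service exposing a JSON list of services at `/api/services` (an internal endpoint used by the Jaeger UI, not a stable public API), on the host port 45400 mapped in the docker-compose file, and on the hypothetical helper names `service_reported` and `fetch_services`.

```python
import json
from urllib.request import urlopen


def service_reported(payload, service_name):
    """Check whether service_name appears in a Jaeger services response.

    Assumption: the response is a JSON object whose "data" key holds the
    list of service names (or null when no service has reported yet).
    """
    document = json.loads(payload)
    return service_name in (document.get("data") or [])


def fetch_services(base_url="http://localhost:45400", timeout=5):
    """Fetch the raw services response from the Jaeger query service."""
    with urlopen(base_url + "/api/services", timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # OTEL_SERVICE_NAME from the docker-compose file.
    found = service_reported(fetch_services(), "testing-ansible-tracing")
    print("service found" if found else "service not found")
```

Run on the host while the stack is up, after the playbook has been executed inside the container.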
cc @v1v
Ping @mihai-satmarean this should fix your error. Ping @v1v for the review maybe 🙏🏾
Hi, thank you! I will find some time these days!
Thanks @wilfriedroset, that's super great. I'll run some manual tests to validate your changes on my end this week (likely by the end of the week).
Thanks for the quick review @felixfontein, I've taken all your comments into account.
LGTM (although I am not familiar with the callback, nor do I use OT at the moment)
@v1v did you have a chance to look at this?
I've been working on setting up a test environment, and it took me longer than expected. I'm going to run some tests shortly, thanks for your patience 🙏
All good, I did run a few manual executions. Thanks so much @wilfriedroset
@wilfriedroset thanks a lot for your contribution! @v1v @russoz thanks a lot for reviewing and testing this!