
[opentelemetry][callback] Add support for http exporter

Open wilfriedroset opened this issue 1 year ago • 6 comments

SUMMARY

The previous version of the callback supported only the gRPC exporter. This was counterintuitive because the documentation mentioned <your endpoint (OTLP/HTTP)>. Users were left with an error similar to Transient error StatusCode.UNAVAILABLE encountered while exporting traces to <endpoint>, retrying in 1s.

This commit fixes the situation by supporting both HTTP and gRPC, configurable via the standard environment variables and ansible.cfg.

Fixes #7888
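
As a rough illustration of the selection logic described above, the sketch below picks an exporter from the standard OTEL_EXPORTER_OTLP_TRACES_PROTOCOL variable. It is not the code from this patch: the function names are hypothetical, and the gRPC default is an assumption based on the previous gRPC-only behavior. The two module paths are the OTLP span exporter modules provided by the opentelemetry-exporter-otlp package; they are kept as strings so the sketch runs without that package installed.

```python
import os

# Module paths of the two OTLP span exporters (each exposes OTLPSpanExporter).
EXPORTER_MODULES = {
    "grpc": "opentelemetry.exporter.otlp.proto.grpc.trace_exporter",
    "http/protobuf": "opentelemetry.exporter.otlp.proto.http.trace_exporter",
}


def resolve_traces_protocol(environ=None, default="grpc"):
    """Return the traces protocol requested via the environment.

    Hypothetical helper: the default of "grpc" is an assumption that mirrors
    the callback's earlier gRPC-only behavior.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("OTEL_EXPORTER_OTLP_TRACES_PROTOCOL") or default
    protocol = value.strip().lower()
    if protocol not in EXPORTER_MODULES:
        raise ValueError("unsupported OTLP traces protocol: %r" % value)
    return protocol


def exporter_module_for(protocol):
    """Map a protocol name to the module that provides OTLPSpanExporter."""
    return EXPORTER_MODULES[protocol]


if __name__ == "__main__":
    chosen = resolve_traces_protocol()
    print(chosen, "->", exporter_module_for(chosen))
```

Rejecting unknown values early keeps a typo in the variable from silently falling back to the wrong transport.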

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

community.general.opentelemetry

ADDITIONAL INFORMATION

To reproduce the issue and validate the correct behavior, consider the following directory setup:

$ tree
.
├── ansible.cfg
├── ansible-execution_environment
│   └── Dockerfile
├── docker-compose.yaml
├── opentelemetry.py
└── test.yaml

2 directories, 5 files

Set ansible.cfg

$ cat ansible.cfg
[defaults]
callbacks_enabled = community.general.opentelemetry
[callback_opentelemetry]
enable_from_environment = ANSIBLE_OPENTELEMETRY_ENABLED
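
The enable_from_environment setting above names an environment variable that gates the callback. The sketch below illustrates that gating with the standard library only. It is an assumption-laden illustration, not how Ansible itself loads the option: callback_enabled is a hypothetical helper, and treating only the string "true" (case-insensitive) as enabling is an assumption.

```python
import configparser

ANSIBLE_CFG = """\
[defaults]
callbacks_enabled = community.general.opentelemetry
[callback_opentelemetry]
enable_from_environment = ANSIBLE_OPENTELEMETRY_ENABLED
"""


def callback_enabled(cfg_text, environ):
    """Return True when the gating variable named in the config is 'true'.

    Hypothetical helper for illustration; when no gating variable is
    configured, the callback is treated as enabled.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    var_name = parser.get(
        "callback_opentelemetry", "enable_from_environment", fallback=None
    )
    if var_name is None:
        return True
    return environ.get(var_name, "").strip().lower() == "true"


if __name__ == "__main__":
    print(callback_enabled(ANSIBLE_CFG, {"ANSIBLE_OPENTELEMETRY_ENABLED": "true"}))
    print(callback_enabled(ANSIBLE_CFG, {}))
```

In this reproducer the variable is set to true in docker-compose.yaml, so the callback is active inside the container.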

Set a sample playbook:

$ cat test.yaml
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Print the gateway for each host when defined
      ansible.builtin.debug:
        msg: Test

The execution environment is shown below. Any environment with Ansible and the correct Python dependencies should work as well.

$ cat ansible-execution_environment/Dockerfile
FROM quay.io/ansible/awx-ee:23.9.0

USER root

RUN pip install --no-cache-dir \
  opentelemetry-api==1.24.0 \
  opentelemetry-exporter-otlp==1.24.0 \
  opentelemetry-sdk==1.24.0

USER 1000

Here is the docker-compose file

$ cat docker-compose.yaml
services:
  jaeger:
    image: jaegertracing/all-in-one:1.56
    container_name: jaeger
    ports:
      - 45400:16686
    environment:
      - COLLECTOR_OTLP_ENABLED=true
  ansible:
    image: reproducer
    build: ./ansible-execution_environment
    container_name: ansible
    volumes:
      - ./:/mnt
      # - ./opentelemetry.py:/usr/share/ansible/collections/ansible_collections/community/general/plugins/callback/opentelemetry.py # Mount the patched file in the correct directory
    environment:
      - ANSIBLE_OPENTELEMETRY_ENABLED=true
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318
      - OTEL_SERVICE_NAME=testing-ansible-tracing
    entrypoint: sleep 3000
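
A note on the endpoint used above: port 4318 is the conventional OTLP/HTTP port, while 4317 is the conventional OTLP/gRPC port, so a gRPC-only exporter pointed at jaeger:4318 cannot succeed. With OTLP/HTTP, the OpenTelemetry exporter specification has the traces path v1/traces appended to the base OTEL_EXPORTER_OTLP_ENDPOINT. The sketch below shows that derivation; both helper names are hypothetical and it only manipulates strings.

```python
from urllib.parse import urlsplit


def otlp_http_traces_url(base_endpoint):
    """Append the per-signal traces path used by OTLP/HTTP to a base endpoint."""
    return base_endpoint.rstrip("/") + "/v1/traces"


def endpoint_port(endpoint):
    """Extract the TCP port from an endpoint URL (None if absent)."""
    return urlsplit(endpoint).port


if __name__ == "__main__":
    endpoint = "http://jaeger:4318"  # value from docker-compose.yaml
    print(otlp_http_traces_url(endpoint))  # http://jaeger:4318/v1/traces
    print(endpoint_port(endpoint))         # 4318
```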

Build the image

docker-compose build

Start the stack

docker-compose up

Now enter the execution environment

docker exec -it -u root -w /mnt ansible /bin/bash

Before the patch, Ansible reports the export error after the play completes:

# ansible-playbook test.yaml
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match
'all'

PLAY [localhost] *****************************************************************************************************

TASK [Print the gateway for each host when defined] ******************************************************************
ok: [localhost] => {
    "msg": "Test"
}

PLAY RECAP ***********************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 8s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 16s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to jaeger:4318, retrying in 32s.
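
The retry delays in the log above double on each attempt (1s, 2s, 4s, ... 32s), which is consistent with an exponential backoff. The following is a minimal sketch of such a schedule, written only to explain the log; it is not the exporter's actual implementation, and the function name and the 32-second cap are assumptions taken from the output shown here.

```python
def backoff_delays(first=1, cap=32):
    """Yield retry delays in seconds that double until the cap is reached."""
    delay = first
    while delay <= cap:
        yield delay
        delay *= 2


if __name__ == "__main__":
    for seconds in backoff_delays():
        print("retrying in %ds" % seconds)
```

Summed up, the six retries add roughly a minute of waiting to the end of every playbook run when the protocol and port do not match.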

After the file is patched and the correct variable is exported, there is no error and the trace is visible in the backend:

# export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf
# ansible-playbook test.yaml
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match
'all'

PLAY [localhost] *****************************************************************************************************

TASK [Print the gateway for each host when defined] ******************************************************************
ok: [localhost] => {
    "msg": "Test"
}

PLAY RECAP ***********************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

wilfriedroset avatar May 06 '24 13:05 wilfriedroset

cc @v1v

ansibullbot avatar May 06 '24 13:05 ansibullbot

Ping @mihai-satmarean this should fix your error. Ping @v1v for the review maybe 🙏🏾

wilfriedroset avatar May 06 '24 13:05 wilfriedroset

Ping @mihai-satmarean this should fix your error. Ping @v1v for the review maybe 🙏🏾

Hi, thank you! I will find some time these days!

mihai-satmarean avatar May 06 '24 14:05 mihai-satmarean

Thanks @wilfriedroset, that's super great. I'll run some manual tests to validate your changes on my end this week (likely by the end of the week).

v1v avatar May 06 '24 18:05 v1v

Thanks for the quick review @felixfontein, I've taken into account all your comments.

wilfriedroset avatar May 06 '24 20:05 wilfriedroset

LGTM (although I am not familiar with the callback, nor do I use OT at the moment)

russoz avatar May 12 '24 05:05 russoz

@v1v did you have a chance to look at this?

felixfontein avatar May 15 '24 05:05 felixfontein

@v1v did you have a chance to look at this?

I've been working on setting up a test environment and it took me longer than expected. I'm going to run some tests shortly, thanks for your patience 🙏

v1v avatar May 15 '24 09:05 v1v

All good, I did run a few manual executions. Thanks so much @wilfriedroset

v1v avatar May 15 '24 16:05 v1v

@wilfriedroset thanks a lot for your contribution! @v1v @russoz thanks a lot for reviewing and testing this!

felixfontein avatar May 15 '24 16:05 felixfontein