Opentelemetry stops sending traces with 9.0.0
Summary
When going from 8.6.2 to 9.0.0, the opentelemetry callback stops sending traces to the endpoint. With the exact same configuration, traces get forwarded in 8.6.2 but go nowhere in 9.0.0. I suspect it's due to how the exporter gets picked, but I can't figure out how to make it work.
otel_exporter = None
if store_spans_in_file:
    otel_exporter = InMemorySpanExporter()
    processor = SimpleSpanProcessor(otel_exporter)
else:
    if otel_exporter_otlp_traces_protocol == 'grpc':
        otel_exporter = GRPCOTLPSpanExporter()
    else:
        otel_exporter = HTTPOTLPSpanExporter()
    processor = BatchSpanProcessor(otel_exporter)
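To make that branch easier to reason about, here is a minimal stand-alone sketch of the same selection logic. It returns class names as plain strings instead of instantiating the real OpenTelemetry SDK classes, and `choose_exporter` is a hypothetical helper for illustration, not something the plugin provides. The point it shows: any protocol value that is not exactly `'grpc'` falls through to the HTTP exporter.

```python
def choose_exporter(store_spans_in_file, protocol):
    """Mirror the branch quoted above, returning (exporter, processor) names.

    Hypothetical helper for illustration only: the real plugin builds
    OpenTelemetry SDK objects rather than returning strings.
    """
    if store_spans_in_file:
        # Spans are collected in memory instead of being sent over the network.
        return "InMemorySpanExporter", "SimpleSpanProcessor"
    if protocol == "grpc":
        exporter = "GRPCOTLPSpanExporter"
    else:
        # Anything that is not exactly 'grpc' ends up on the HTTP exporter.
        exporter = "HTTPOTLPSpanExporter"
    return exporter, "BatchSpanProcessor"


if __name__ == "__main__":
    for args in [(None, "grpc"), (None, "http/protobuf"), ("/dev/stdout", "grpc")]:
        print(args, "->", choose_exporter(*args))
```

If the gRPC endpoint on port 4317 is the target, this suggests checking that the protocol option really resolves to `grpc` at runtime.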
Issue Type
Bug Report
Component Name
opentelemetry callback
Ansible Version
$ ansible --version
ansible [core 2.17.1]
config file = /ansible/ansible.cfg
configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.12/site-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.12.3 (main, Apr 17 2024, 00:00:00) [GCC 14.0.1 20240411 (Red Hat 14.0.1-0)] (/usr/bin/python3)
jinja version = 3.1.4
libyaml = True
Community.general Version
$ ansible-galaxy collection list community.general
Collection Version
----------------- -------
community.general 9.1.0
Configuration
$ ansible-config dump --only-changed
OS / Environment
No response
Steps to Reproduce
ansible config:
[defaults]
callbacks_enabled = community.general.opentelemetry
Run playbook: OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 ansible-playbook playbook.yml
Expected Results
Expect traces to be sent to endpoint
Actual Results
Traces are never forwarded
Code of Conduct
- [X] I agree to follow the Ansible Code of Conduct
cc @v1v
https://github.com/ansible-collections/community.general/pull/8321 is the PR that introduced the support for the http exporter.
As far as I see, the change uses the same exporter by default.
Can you try to run the plugin with the explicit configuration entries?
ansible.cfg:
[defaults]
callbacks_enabled = community.general.opentelemetry
[callback_opentelemetry]
otel_exporter_otlp_traces_protocol = grpc
store_spans_in_file = None
IIUC, you tried locally running OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317/ ansible-playbook playbook.yml against your OTEL collector, right?
Tried setting all of the possible config options to their defaults in the ansible.cfg and still the same issue, it just simply isn't trying to send the traces. If I do a store_spans_in_file=/dev/stdout instead just to see, it prints them to the screen, so I know it's tracing, it's just for some reason not sending to the otlp endpoint...
Seeing the same issue here. Works nicely in 8.6, but silently stops sending traces in >=9.0.0.
I can see a few changes were added to v9.0:
- https://github.com/ansible-collections/community.general/blob/stable-9/CHANGELOG.md#v9-0-0
IIUC, from the description, the issue might be related to supporting HTTP exporters and the existing GRPC support.
@wilfriedroset @russoz, since you worked and helped on https://github.com/ansible-collections/community.general/pull/8321, would you mind if I asked you to double-check if things work nicely on your end if you use >=9.0.0? 🙇
Tested with 9.2.0: the problem persists.
8.6.3 works OK.
ansible [core 2.14.14]
config file = /home/cervenka/.ansible.cfg
configured module search path = ['/home/cervenka/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3.9/site-packages/ansible
ansible collection location = /home/cervenka/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/bin/ansible
python version = 3.9.18 (main, Jan 4 2024, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] (/usr/bin/python3)
jinja version = 3.1.2
libyaml = True
Hi @v1v I pretty much helped review it from a Python/Ansible perspective, I am not familiar enough with OpenTelemetry to make a call on the plugin logic.
@wilfriedroset Would it be possible for you to double check the code change? TIA
I have just reviewed the changes in that PR, and to the best of my ability I could not find anything that would be a problem. There are 4 other PRs after #8321 that might have introduced a problem (I have not x-ref'd them with the version tag, so probably not all of them apply).
I've merged #8741, would be great if someone could verify that it fixes this bug.
@felixfontein
With this version we only get the trace, without any spans. With community.general < 9.0.0, all the spans are reported correctly.
@wilfriedroset @v1v ^
Friendly push: does anyone have any pointers to the cause of this?
@wilfriedroset @v1v
Sorry for the radio silence;
I cannot reproduce the missing traces/spans error with the latest changes in main.
How did I test this out?
I've been using the latest changes for the otel ansible plugin and testing against an OTEL Collector that has been configured with the Elastic exporter
OTEL collector config
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  otlp/elastic:
    endpoint: "${env:APM_URL}"
    headers:
      Authorization: "Bearer ${env:APM_TOKEN}"
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/elastic]
    logs:
      receivers: [otlp]
      exporters: [otlp/elastic]
Then I ran docker compose with the below settings:
docker-compose.yml
---
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    platform: linux/arm64
    volumes:
      - ./config/otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "1888:1888" # pprof extension
      - "13133:13133" # health_check extension
      - "4317:4317" # OTLP gRPC receiver
      - "55670:55679" # zpages extension
    environment:
      APM_URL: ${APM_URL}
      APM_TOKEN: ${APM_TOKEN}
    networks:
      - otel
volumes:
  otel:
    driver: local
networks:
  otel:
and ran:
$ OTEL_EXPORTER_OTLP_INSECURE=true \
OTEL_EXPORTER_OTLP_ENDPOINT=localhost:4317 \
ansible-playbook playbook.yml
and so far so good in both cases OTEL_EXPORTER_OTLP_ENDPOINT=localhost:4317 and OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317.
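Since both endpoint spellings come up in this thread, a small sanity check can help rule out a port/protocol mismatch before digging into the plugin. The sketch below uses only the standard library; `split_endpoint` and `port_matches_protocol` are hypothetical helper names, and the port table reflects the conventional OTLP defaults (4317 for gRPC, 4318 for HTTP), which a given collector may override.

```python
from urllib.parse import urlsplit

# Conventional OTLP receiver ports; a collector can be configured differently.
CONVENTIONAL_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def split_endpoint(endpoint):
    """Return (scheme, host, port) for 'host:port' or 'scheme://host:port'."""
    # urlsplit only recognises the host when '//' is present, so prepend it
    # to bare 'host:port' values such as 'localhost:4317'.
    parts = urlsplit(endpoint if "://" in endpoint else "//" + endpoint)
    return parts.scheme or None, parts.hostname, parts.port


def port_matches_protocol(endpoint, protocol):
    """True when the endpoint port is the conventional one for the protocol."""
    expected = CONVENTIONAL_PORTS.get(protocol)
    _scheme, _host, port = split_endpoint(endpoint)
    return expected is not None and port == expected


if __name__ == "__main__":
    for endpoint in ("localhost:4317", "http://localhost:4317"):
        print(endpoint, split_endpoint(endpoint),
              port_matches_protocol(endpoint, "grpc"))
```

A `False` result is only a hint that the exporter protocol and the receiver port may not line up; it does not prove a misconfiguration.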
My current environment is:
ansible [core 2.16.6]
config file = /Users/vmartinez/workspaces/v1v/its-ansible-otel/ansible.cfg
configured module search path = ['/Users/vmartinez/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /Users/vmartinez/workspaces/v1v/its-ansible-otel/.venv/lib/python3.12/site-packages/ansible
ansible collection location = /Users/vmartinez/.ansible/collections:/usr/share/ansible/collections
executable location = /Users/vmartinez/workspaces/v1v/its-ansible-otel/.venv/bin/ansible
python version = 3.12.8 (main, Dec 3 2024, 18:42:41) [Clang 16.0.0 (clang-1600.0.26.4)] (/Users/vmartinez/workspaces/v1v/its-ansible-otel/.venv/bin/python)
jinja version = 3.1.4
libyaml = True
Package Version
---------------------------------------- --------
ansible 9.5.1
ansible-core 2.16.6
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
cryptography 42.0.7
Deprecated 1.2.14
docker 7.0.0
googleapis-common-protos 1.63.0
grpcio 1.63.0
idna 3.7
importlib-metadata 7.0.0
iniconfig 2.0.0
Jinja2 3.1.4
MarkupSafe 2.1.5
opentelemetry-api 1.24.0
opentelemetry-exporter-otlp 1.24.0
opentelemetry-exporter-otlp-proto-common 1.24.0
opentelemetry-exporter-otlp-proto-grpc 1.24.0
opentelemetry-exporter-otlp-proto-http 1.24.0
opentelemetry-proto 1.24.0
opentelemetry-sdk 1.24.0
opentelemetry-semantic-conventions 0.45b0
packaging 24.0
pip 24.0
pluggy 1.5.0
protobuf 4.25.3
pycparser 2.22
pytest 8.2.0
PyYAML 6.0.1
requests 2.31.0
resolvelib 1.0.1
typing_extensions 4.11.0
urllib3 2.2.1
wrapt 1.16.0
zipp 3.18.1
"org.opencontainers.image.source": "https://github.com/open-telemetry/opentelemetry-collector-releases",
"org.opencontainers.image.version": "0.100.0"
If I update those dependencies, it works too:
ansible [core 2.18.1]
config file = /Users/vmartinez/workspaces/v1v/its-ansible-otel/ansible.cfg
configured module search path = ['/Users/vmartinez/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /Users/vmartinez/workspaces/v1v/its-ansible-otel/.venv/lib/python3.12/site-packages/ansible
ansible collection location = /Users/vmartinez/.ansible/collections:/usr/share/ansible/collections
executable location = /Users/vmartinez/workspaces/v1v/its-ansible-otel/.venv/bin/ansible
python version = 3.12.8 (main, Dec 3 2024, 18:42:41) [Clang 16.0.0 (clang-1600.0.26.4)] (/Users/vmartinez/workspaces/v1v/its-ansible-otel/.venv/bin/python)
jinja version = 3.1.5
libyaml = True
Package Version
---------------------------------------- ----------
ansible 11.1.0
ansible-core 2.18.1
certifi 2024.12.14
cffi 1.17.1
charset-normalizer 3.4.1
cryptography 44.0.0
Deprecated 1.2.15
googleapis-common-protos 1.66.0
grpcio 1.68.1
idna 3.10
importlib_metadata 8.5.0
iniconfig 2.0.0
Jinja2 3.1.5
MarkupSafe 3.0.2
opentelemetry-api 1.29.0
opentelemetry-exporter-otlp 1.29.0
opentelemetry-exporter-otlp-proto-common 1.29.0
opentelemetry-exporter-otlp-proto-grpc 1.29.0
opentelemetry-exporter-otlp-proto-http 1.29.0
opentelemetry-proto 1.29.0
opentelemetry-sdk 1.29.0
opentelemetry-semantic-conventions 0.50b0
packaging 24.2
pip 24.3.1
pluggy 1.5.0
protobuf 5.29.2
pycparser 2.22
pytest 8.3.4
PyYAML 6.0.2
requests 2.32.3
resolvelib 1.0.1
typing_extensions 4.12.2
urllib3 2.3.0
wrapt 1.17.0
zipp 3.21.0
If you'd like to reuse what I've done, https://github.com/v1v/otel-ansible-callback-plugin/pull/2 might help you - you can configure another OTEL vendor.
Please let me know which vendors you see it not working with.
However, if I use the latest container (0.116.1) for the OTEL Collector:
"org.opencontainers.image.created": "2024-12-17T21:09:34Z",
"org.opencontainers.image.licenses": "Apache-2.0",
"org.opencontainers.image.name": "opentelemetry-collector-releases",
"org.opencontainers.image.revision": "62dfc10402322ae4e2cdbdd92a0c0cc797f1b1f4",
"org.opencontainers.image.source": "https://github.com/open-telemetry/opentelemetry-collector-releases",
"org.opencontainers.image.version": "0.116.1"
Then the same setup does not work:
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 8s.
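When the exporter keeps retrying with `StatusCode.UNAVAILABLE`, a quick first check is whether anything accepts TCP connections on the collector port at all. The following is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper, and `localhost:4317` is taken from the setup above.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    print("collector reachable:", can_connect("localhost", 4317))
```

A successful connect only shows that something is listening on the published port; it does not prove that the gRPC receiver behind it is healthy, so the collector's own logs are still worth checking.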
Regardless, https://github.com/ansible-collections/community.general/blob/main/plugins/callback/opentelemetry.py works fine if I use OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS without the OTEL Collector itself:
OTEL_EXPORTER_OTLP_INSECURE=true \
OTEL_EXPORTER_OTLP_ENDPOINT=https://*****.elastic-cloud.com:443 \
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer *****" \
ansible-playbook playbook.yml
[...]
PLAY RECAP *********************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
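For anyone double-checking the `OTEL_EXPORTER_OTLP_HEADERS` value used above, the variable holds comma-separated `key=value` pairs. The sketch below is a simplified, hypothetical parser (`parse_otlp_headers` is not an SDK function) that shows how such a string breaks down; the SDK's own parsing may differ in details such as URL-decoding or key normalization.

```python
def parse_otlp_headers(raw):
    """Split a 'k1=v1,k2=v2' string into a dict (simplified illustration)."""
    headers = {}
    for item in raw.split(","):
        if not item.strip():
            continue  # ignore empty segments such as a trailing comma
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = item.partition("=")
        if not sep:
            continue  # skip entries without a key=value shape
        headers[key.strip()] = value.strip()
    return headers


if __name__ == "__main__":
    print(parse_otlp_headers("Authorization=Bearer my-token"))
```

Printing the parsed result makes it easy to spot a missing `=` or a stray comma in the header string before running the playbook.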
We can close this issue. So far I have not been able to reproduce the issue after the fix done at https://github.com/ansible-collections/community.general/issues/8566#issuecomment-2283148803
@moserke any objection to that?
needs_info
Thanks @russoz. Sounds good to me. My apologies for missing all of these.
It started to work again with one of the latest versions. Looks good here too.
@v1v since you're a maintainer for this plugin you can write close_me in a comment to make the bot close the issue. (https://github.com/ansible/ansibullbot/blob/devel/ISSUE_HELP.md#commands) (I won't close it now so you can try it out ;-) )
close_me