netobserv-ebpf-agent
NETOBSERV-2455 - Get DNS Name
Description
Parse truncated DNS QNAME and append it to the flows when available
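As an illustration of what parsing a truncated QNAME involves, here is a minimal Go sketch. It assumes the captured bytes use the RFC 1035 label encoding (a length octet followed by that many bytes, ending with a zero-length label). The function name `decodeQName` and the sample buffers are hypothetical and are not taken from this PR's code.

```go
package main

import (
	"fmt"
	"strings"
)

// decodeQName is a hypothetical helper, not the agent's actual function.
// It walks RFC 1035 labels in buf and returns the dotted name decoded so far,
// plus whether the terminating zero-length label was reached.
func decodeQName(buf []byte) (string, bool) {
	var labels []string
	i := 0
	for i < len(buf) {
		l := int(buf[i])
		if l == 0 {
			// Root label: the name is complete.
			return strings.Join(labels, "."), true
		}
		if l&0xC0 != 0 {
			// Compression pointer or reserved label type: not followed here.
			return strings.Join(labels, "."), false
		}
		i++
		if i+l > len(buf) {
			// The label is cut by the capture limit: keep the partial bytes.
			if i < len(buf) {
				labels = append(labels, string(buf[i:]))
			}
			return strings.Join(labels, "."), false
		}
		labels = append(labels, string(buf[i:i+l]))
		i += l
	}
	// Ran out of bytes before seeing the terminator.
	return strings.Join(labels, "."), false
}

func main() {
	full := []byte{3, 'w', 'w', 'w', 7, 'e', 'x', 'a', 'm', 'p', 'l', 'e', 3, 'c', 'o', 'm', 0}
	fmt.Println(decodeQName(full))     // www.example.com true
	fmt.Println(decodeQName(full[:9])) // www.exam false
}
```

Returning the partial name together with a completeness flag lets the caller decide whether to attach a truncated name to the flow or drop it.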
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [ ] Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
  - [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [ ] Does this PR require a product release notes entry?
  - [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  - [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
  - [x] Standard QE validation, with pre-merge tests unless stated otherwise.
  - [ ] Regression tests only (e.g. refactoring with no user-facing change).
  - [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
To run a perfscale test, comment with: /test ebpf-node-density-heavy-25nodes
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign msherif1234 for approval. For more information see the Code Review Process.
Tested on OCP 4.19 with CLI by adding the following column in the config:
- id: DNSName
  group: DNS
  name: DNS Name
  tooltip: DNS query name.
  field: DnsName
  filter: dns_name
  default: false
  width: 15
  feature: dnsTracking
New images: quay.io/netobserv/ebpf-bytecode:34362e6 quay.io/netobserv/netobserv-ebpf-agent:34362e6
These will expire after two weeks.
To deploy this build, run from the operator repo, assuming the operator is running:
USER=netobserv VERSION=34362e6 make set-agent-image
/LGTM
@jpinsonneau I've reviewed this and overall it looks good to me. However, I'd like to understand how wide the impact of not supporting compression pointers is: in your live tests, did you see it happen often, either in a k8s DNS context or when connecting to a public server? Maybe we should introduce error metrics in the decoding, to track when we could not decode?
I can run some tests and try to get some numbers :wink:
New images: quay.io/netobserv/ebpf-bytecode:b63d880 quay.io/netobserv/netobserv-ebpf-agent:b63d880
These will expire after two weeks.
To deploy this build, run from the operator repo, assuming the operator is running:
USER=netobserv VERSION=b63d880 make set-agent-image
With current implementation on a real OCP cluster:
- 20% are truncated because of DNS_NAME_MAX_LEN
- 1% because of compression pointers. That number is especially low because we can't follow pointers that are outside the truncated bounds.
Increasing the max len to 64 helps a bit:
- 17% are truncated because of DNS_NAME_MAX_LEN
- 0.8% because of compression pointers
However, that would depend heavily on the workload 🤔
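To make the error-metrics suggestion concrete, here is a rough Go sketch of how decode outcomes could be classified and counted. It is not the code used to produce the numbers above: the `outcome` type, `classify`, `tally`, and the sample buffers are all hypothetical, and each buffer is assumed to hold only the bytes captured up to the length limit.

```go
package main

import "fmt"

type outcome int

const (
	complete   outcome = iota // terminating zero-length label found
	hitMaxLen                 // ran out of captured bytes before the terminator
	hitPointer                // met a compression pointer that is not followed
)

// classify is a hypothetical helper: it walks the labels in the captured
// bytes and reports why decoding stopped.
func classify(buf []byte) outcome {
	for i := 0; i < len(buf); {
		l := int(buf[i])
		switch {
		case l == 0:
			return complete
		case l&0xC0 == 0xC0:
			return hitPointer
		}
		i += 1 + l // skip the length octet and the label bytes
	}
	return hitMaxLen
}

// tally counts outcomes over a set of samples; the counts could then be
// exposed as error metrics.
func tally(samples [][]byte) map[outcome]int {
	counts := make(map[outcome]int)
	for _, s := range samples {
		counts[classify(s)]++
	}
	return counts
}

func main() {
	samples := [][]byte{
		{3, 'f', 'o', 'o', 0},           // complete name
		{3, 'f', 'o', 'o', 3, 'b', 'a'}, // cut by the length limit
		{3, 'f', 'o', 'o', 0xC0, 0x0C},  // ends with a compression pointer
	}
	c := tally(samples)
	fmt.Println(c[complete], c[hitMaxLen], c[hitPointer]) // 1 1 1
}
```

Keeping the two failure causes as separate counters would show whether raising the length limit or following pointers gives the larger gain for a given workload.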
/lgtm
/test ebpf-node-density-heavy-25nodes
@jpinsonneau: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/qe-e2e-tests | ab1368b0aca4bb58877a7aa034325e30b9b64618 | link | false | /test qe-e2e-tests |
| ci/prow/ebpf-node-density-heavy-25nodes | ab1368b0aca4bb58877a7aa034325e30b9b64618 | link | true | /test ebpf-node-density-heavy-25nodes |