netobserv-ebpf-agent

Add pkt drop filter support

Open msherif1234 opened this issue 1 year ago • 16 comments

Description

Enhance flow filtering so that it can match only flows with packet drops. Recent kernels have added new drop causes, so the drop-cause list is also updated to match newer kernels.
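
To illustrate why the drop-cause list has to track the kernel, here is a minimal Go sketch of a code-to-name lookup with a fallback for codes the table does not know yet. The names `dropCauseNames` and `DropCauseName` are hypothetical and are not taken from this PR. The numeric codes are placeholders: the real values come from the kernel's `enum skb_drop_reason` and vary between kernel versions.

```go
package main

import "fmt"

// dropCauseNames maps kernel drop-reason codes to readable names.
// ASSUMPTION: the numeric codes below are illustrative placeholders only;
// the real values are defined by the kernel's enum skb_drop_reason and
// differ between kernel versions.
var dropCauseNames = map[uint32]string{
	2: "SKB_DROP_REASON_NOT_SPECIFIED",
	3: "SKB_DROP_REASON_NO_SOCKET",
}

// DropCauseName returns the name for a code, or a numeric fallback when the
// running kernel reports a cause that is missing from the table.
func DropCauseName(code uint32) string {
	if name, ok := dropCauseNames[code]; ok {
		return name
	}
	return fmt.Sprintf("SKB_DROP_UNKNOWN_%d", code)
}

func main() {
	fmt.Println(DropCauseName(3))
	fmt.Println(DropCauseName(999))
}
```

A cause added by a newer kernel falls through to the numeric fallback until the table is refreshed, which is the gap a list update closes.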

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • [ ] Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • [ ] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • [ ] Does this PR require product documentation?
    • [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • [ ] Does this PR require a product release notes entry?
    • [ ] If so, fill in "Release Note Text" in the JIRA.
  • [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • [ ] If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • [ ] Standard QE validation, with pre-merge tests unless stated otherwise.
    • [ ] Regression tests only (e.g. refactoring with no user-facing change).
    • [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

msherif1234 avatar Sep 26 '24 11:09 msherif1234

Codecov Report

Attention: Patch coverage is 0% with 14 lines in your changes missing coverage. Please review.

Project coverage is 30.01%. Comparing base (0e1a103) to head (876b155). Report is 1 commits behind head on main.

Files with missing lines          Patch %   Lines
pkg/decode/decode_protobuf.go     0.00%     10 Missing ⚠
pkg/tracer/flow_filter.go         0.00%     1 Missing and 1 partial ⚠
pkg/agent/agent.go                0.00%     1 Missing ⚠
pkg/agent/packets_agent.go        0.00%     1 Missing ⚠
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #419      +/-   ##
==========================================
- Coverage   30.12%   30.01%   -0.11%     
==========================================
  Files          50       50              
  Lines        4090     4104      +14     
==========================================
  Hits         1232     1232              
- Misses       2752     2765      +13     
- Partials      106      107       +1     
Flag        Coverage Δ
unittests   30.01% <0.00%> (-0.11%) ↓

Flags with carried forward coverage won't be shown.

Files with missing lines          Coverage Δ
pkg/agent/config.go               10.00% <ø> (ø)
pkg/ebpf/bpf_x86_bpfel.go         0.00% <ø> (ø)
pkg/agent/agent.go                33.53% <0.00%> (-0.11%) ↓
pkg/agent/packets_agent.go        0.00% <0.00%> (ø)
pkg/tracer/flow_filter.go         47.74% <0.00%> (-0.63%) ↓
pkg/decode/decode_protobuf.go     26.23% <0.00%> (-0.84%) ↓

codecov[bot] avatar Sep 26 '24 11:09 codecov[bot]

/ok-to-test

msherif1234 avatar Sep 26 '24 11:09 msherif1234

New image: quay.io/netobserv/netobserv-ebpf-agent:257a83c

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=257a83c make set-agent-image

github-actions[bot] avatar Sep 26 '24 11:09 github-actions[bot]

/ok-to-test

msherif1234 avatar Sep 26 '24 13:09 msherif1234

New image: quay.io/netobserv/netobserv-ebpf-agent:00b0880

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=00b0880 make set-agent-image

github-actions[bot] avatar Sep 26 '24 13:09 github-actions[bot]

@msherif1234: This pull request references NETOBSERV-1896 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.

In response to this:

Description

enhance flow filtering to look only for flows with drops

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci-robot avatar Sep 27 '24 11:09 openshift-ci-robot

@msherif1234: This pull request references NETOBSERV-1896 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.

In response to this:

Description

With flow filtering enabled in the packet-drop hook, the program exceeds the BPF maximum stack size (512 bytes) and fails verification with:

    loading and assigning BPF objects: field KfreeSkb: program kfree_skb: load program: permission denied: combined stack size of 2 calls is 544. Too large

Changes in this PR:

  • Refactor the packet-drop code to use the stack more efficiently.
  • Enhance flow filtering to look only for flows with drops.

openshift-ci-robot avatar Sep 27 '24 11:09 openshift-ci-robot

/ok-to-test

msherif1234 avatar Sep 27 '24 11:09 msherif1234

New image: quay.io/netobserv/netobserv-ebpf-agent:8b1defb

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=8b1defb make set-agent-image

github-actions[bot] avatar Sep 27 '24 11:09 github-actions[bot]

From a quick glance at the code and the Slack discussion, this PR also adds filtering on PktDrops, which is a new feature. We should move that to a separate PR and keep only the refactoring that fixes the bug here, since it's too late to take new features in 1.7.

memodi avatar Sep 27 '24 14:09 memodi

@msherif1234: No Jira issue is referenced in the title of this pull request. To reference a jira issue, add 'XYZ-NNN:' to the title of this pull request and request another refresh with /jira refresh.

openshift-ci-robot avatar Sep 27 '24 16:09 openshift-ci-robot

/ok-to-test

msherif1234 avatar Oct 04 '24 18:10 msherif1234

New image: quay.io/netobserv/netobserv-ebpf-agent:129c71e

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=129c71e make set-agent-image

github-actions[bot] avatar Oct 04 '24 18:10 github-actions[bot]

      privileged: true
      features:
      - PacketDrop
      flowFilter:
        enable: true
        pktDrops: true
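
As a rough illustration of what `pktDrops: true` asks for, the following Go sketch keeps only flows that recorded at least one dropped packet. It is a userspace illustration only, not the agent's implementation: `FlowRecord`, `FilterRule`, and `Match` are hypothetical names, and the real record and filter types carry many more fields.

```go
package main

import "fmt"

// FlowRecord is a hypothetical, simplified flow record.
type FlowRecord struct {
	SrcAddr     string
	DstAddr     string
	Packets     uint64
	DroppedPkts uint64 // packets the kernel reported as dropped for this flow
}

// FilterRule is a hypothetical stand-in for the flowFilter settings above.
type FilterRule struct {
	Enable   bool
	PktDrops bool // when true, keep only flows that saw at least one drop
}

// Match reports whether the flow passes the rule.
func (r FilterRule) Match(f FlowRecord) bool {
	if !r.Enable {
		return true // filtering disabled: every flow passes
	}
	if r.PktDrops && f.DroppedPkts == 0 {
		return false // drop-only filter: skip flows without drops
	}
	return true
}

func main() {
	rule := FilterRule{Enable: true, PktDrops: true}
	flows := []FlowRecord{
		{SrcAddr: "10.0.0.1", DstAddr: "10.0.0.2", Packets: 12, DroppedPkts: 0},
		{SrcAddr: "10.0.0.3", DstAddr: "10.0.0.4", Packets: 7, DroppedPkts: 2},
	}
	for _, f := range flows {
		fmt.Printf("%s -> %s kept=%v\n", f.SrcAddr, f.DstAddr, rule.Match(f))
	}
}
```

With the rule above, only the second flow is kept, since it is the only one with a non-zero drop count.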

msherif1234 avatar Oct 04 '24 19:10 msherif1234

/approve

msherif1234 avatar Oct 08 '24 12:10 msherif1234

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msherif1234

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

openshift-ci[bot] avatar Oct 08 '24 12:10 openshift-ci[bot]