
TestRepeatedNetworkFlow checks that number of active connections is in expected range

Open JoukoVirtanen opened this issue 1 month ago • 1 comments

Description

The TestRepeatedNetworkFlowWithZeroAfterglowPeriod test fails frequently with errors such as:

=== RUN   TestRepeatedNetworkFlowWithZeroAfterglowPeriod/TestRepeatedNetworkFlow
    expect_conn.go:72: 
        	Error Trace:	/home/runner/work/collector/collector/integration-tests/pkg/mock_sensor/expect_conn.go:72
        	            				/home/runner/work/collector/collector/integration-tests/suites/repeated_network_flow.go:114
        	Error:      	timed out
        	Test:       	TestRepeatedNetworkFlowWithZeroAfterglowPeriod/TestRepeatedNetworkFlow
        	Messages:   	found 4 connections (expected 3)

This test is currently too strict, since the number of observed active connections cannot be guaranteed. Although the connection is short-lived, its duration is finite rather than instantaneous, so it can still be active when a scrape happens. This PR checks that the number of observed active connections falls within an expected range, instead of asserting a specific number as the test does today.
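As a rough illustration of the idea (not the code in this PR), a range check could look like the sketch below. The helper name `inRange` and the bounds 3 and 4 are assumptions made for this example; the bounds are taken only from the failure message above ("found 4 connections (expected 3)").

```go
package main

import "fmt"

// inRange reports whether observed lies within the inclusive range [lo, hi].
// Hypothetical helper for illustration; it is not part of the collector tests.
func inRange(observed, lo, hi int) bool {
	return observed >= lo && observed <= hi
}

func main() {
	// Assumed bounds: at least 3 active connections, and at most 4 when a
	// scrape happens to catch one short-lived connection while it is open.
	const minActive, maxActive = 3, 4

	for _, observed := range []int{2, 3, 4, 5} {
		fmt.Printf("observed=%d withinRange=%t\n",
			observed, inRange(observed, minActive, maxActive))
	}
}
```

In a testify-based suite the same check could be written with `assert.GreaterOrEqual` and `assert.LessOrEqual` against the lower and upper bounds.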

If the error is that four close events are reported for the connection, this test will still fail. There may be a race condition between obtaining the connection from procfs and obtaining it from syscalls: in one scrape interval the connection might be reported closed because it was obtained from a syscall, and in the next scrape interval it might be reported closed again because it was obtained from procfs. The changes here don't fix that issue.

Checklist

  • [ ] Investigated and inspected CI test results
  • [ ] Updated documentation accordingly

Automated testing

  • [ ] Added unit tests
  • [ ] Added integration tests
  • [ ] Added regression tests

If any of these don't apply, please comment below.

Testing Performed

Ran the test locally.

JoukoVirtanen • Nov 02 '25 02:11

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 27.60%. Comparing base (55be868) to head (2b28cd8). :warning: Report is 10 commits behind head on master. :white_check_mark: All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2642   +/-   ##
=======================================
  Coverage   27.60%   27.60%           
=======================================
  Files          95       95           
  Lines        5422     5422           
  Branches     2523     2523           
=======================================
  Hits         1497     1497           
  Misses       3213     3213           
  Partials      712      712           
Flag Coverage Δ
collector-unit-tests 27.60% <ø> (ø)


codecov-commenter • Nov 02 '25 02:11

I think it's worth adding to the comments that the reason for this uncertainty is that we get information from both the scraper and signals at the same time, and leaving a TODO to add more tests for scraper-only and signals-only setups, which don't have this uncertainty.

I have created a ticket for integration tests that use procfs only or syscalls only as the source of networking events. https://issues.redhat.com/browse/ROX-31753

The ticket explains why this is needed. A TODO with the ticket and this information has been added to integration_test.go. This information has also been added to the PR description.

JoukoVirtanen • Nov 13 '25 18:11