Flaky test: [tcp routing] TCP Routing external ports with a second external port [It] maps both ports to the same application
The TCP Routing test that checks whether one app can be reached on two ports fails frequently. The failing spec is here: https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/tcp_routing/tcp_routing.go#L131
Example failures:
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/82
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/57
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/113
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/120
I've recreated the test setup manually on fips/snape. The setup works as expected: data sent over two different TCP ports reaches the test app, and the app responds correctly. When run as part of the CATs suite, however, the test fails often.
I've added some debug statements with timestamps. Here's the flow from a failed run:
# sending first test message to first port
# https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L406
starting SendAndReceive(tcp.cf.snape.env.wg-ard.ci.cloudfoundry.org, 1031) at Jul 11 14:49:45.862
# output from test app: https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/assets/tcp-listener/main.go#L53
# "10.0.32.11" is one of the two tcp-routers
2024-07-11T12:49:45.97+0000 [APP/PROC/WEB/0] OUT Message to 10.0.32.11:41084: server1:Time is 938260798
2024-07-11T12:49:45.99+0000 [APP/PROC/WEB/0] OUT Jul 11 14:49:45.991 (read) Closing connection to 10.0.32.11:41084: EOF
# sending second test message to other port
starting SendAndReceive(tcp.cf.snape.env.wg-ard.ci.cloudfoundry.org, 1026) at Jul 11 14:49:45.955
# now we are failing here when reading the response:
# https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L437
Jul 11 14:54:46.575 error3: EOF
buff is:
When the second message is sent, the conn.Write(message) statement returns no error:
https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L417
However, the test app doesn't seem to receive the message: there is no "Message to" log statement for it. What happens next is an error at the conn.Read(buff) statement:
https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L429
The error is "EOF" and the buffer is empty.
Looks like a race condition. The Read function is probably called before the test app starts to write and fails immediately with EOF?