Bitrise reports test suite as passing, despite failures, when using `retry_on_failure`

Open jessesquires opened this issue 4 years ago • 10 comments

Troubleshooting

  • [x] I've searched discuss.bitrise.io for possible solutions.
  • Which version of the step is affected? Latest, 4.x
  • Is the issue reproducible with the latest version? YES
  • Does the issue happen sporadically, or every time? EVERY TIME
  • Is the issue reproducible locally by following our local debug guide? YES

Seems slightly related to #187, although that describes a different issue.

Issue description

When using test_repetition_mode: "retry_on_failure", if there are actual test failures, Bitrise reports that all tests have succeeded and the step does not fail.

- xcode-test:
  title: Unit Tests
  inputs:
  - project_path: "$WORKSPACE"
  - scheme: "$SCHEME_UNIT"
  - destination: "$DEST_IPAD"
  - test_repetition_mode: "retry_on_failure"
  - maximum_test_repetitions: 3

Bitrise info

  • Build URL:
    • With retry_on_failure enabled, passes (incorrect): https://app.bitrise.io/build/f046cabd-5961-4807-90b1-d2a5b93c590c#?tab=log
    • With retry_on_failure disabled, fails (correct): https://app.bitrise.io/build/41e9bbcb-9af0-45f5-8a64-b8682589771e#?tab=log
  • Bitrise Support enabled: YES
  • Logs: attaching screenshots below

These are the test results with retry_on_failure enabled (i.e., the configuration above). As you can see, they are incorrectly reported as "succeeded" even though there are failures.

- xcode-test:
  title: Unit Tests
  inputs:
  - project_path: "$WORKSPACE"
  - scheme: "$SCHEME_UNIT"
  - destination: "$DEST_IPAD"
  # RETRY ON FAILURE
  - test_repetition_mode: "retry_on_failure"
  - maximum_test_repetitions: 3

[Screenshot: Screen Shot 2021-10-20 at 1 57 02 PM]

These are the test results without specifying test_repetition_mode and maximum_test_repetitions. As you can see, the failure is correctly reported.

- xcode-test:
  title: Unit Tests
  inputs:
  - project_path: "$WORKSPACE"
  - scheme: "$SCHEME_UNIT"
  - destination: "$DEST_IPAD"
  # DO NOT RETRY

[Screenshot: Screen Shot 2021-10-20 at 1 57 41 PM]

Steps to reproduce

  1. Create an Xcode project with unit tests.
  2. Write tests that fail.
  3. Add the following to your xcode-test step:
  - test_repetition_mode: "retry_on_failure"
  - maximum_test_repetitions: 3
  4. xcode-test reports that the tests succeeded, despite having failures.
  5. Remove test_repetition_mode and maximum_test_repetitions.
  6. Tests now report as failed.

jessesquires avatar Oct 20 '21 21:10 jessesquires

Update:

I tested just using plain xcodebuild and the results were as expected, failures reported correctly.

xcodebuild test -project MyApp.xcodeproj -scheme MyApp -destination 'platform=iOS Simulator,name=iPhone 13,OS=15.0' -test-iterations 3 -retry-tests-on-failure

So, this appears to be an issue specifically with the xcode-test step on Bitrise.

jessesquires avatar Oct 20 '21 22:10 jessesquires

Hey @jessesquires! 👋

Thanks for reporting this issue to us!

In order to take a deeper look into the issue, we'd like to ask for two additional build logs:

  • One similar to the first one, but only with a single Xcode Test step (with Test Repetition enabled that should have failed) so that uploaded artifacts are not overridden.
  • One where you run plain xcodebuild, just as you described above.

Thanks in advance!

Bence1001 avatar Oct 22 '21 09:10 Bence1001

Hey @Bence1001 -- I will try to get this to you as soon as I can, but it will take some time to recreate those specific builds since the project has continued to move forward.

jessesquires avatar Oct 25 '21 17:10 jessesquires

Hey @jessesquires -- We noticed something in your test logs: after each failed test case, the next (passing) test case name is the same as the failing one.

    ✗ HSParameterPrototypeSpec_PrototypeCollection_TheParamIsASetTextParam_ShouldReturnTheCollectionOfNumbers_Math_VariablesAndTraits, expected subject to equal [...]
    ✓ HSParameterPrototypeSpec_PrototypeCollection_TheParamIsASetTextParam_ShouldReturnTheCollectionOfNumbers_Math_VariablesAndTraits (0.003 seconds)

I think this is the retry mechanism in action, and although I don't see what each test case is doing, they probably fail on the first try and succeed on the second.
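
A quick way to check a log for this pattern is sketched below. It assumes the log uses the `✗ name, reason` and `✓ name (N seconds)` line shapes shown above; the function names are hypothetical and this is not how the step itself parses output.

```python
import re

# Line shapes taken from the log excerpt above (an assumption about the formatter).
FAILED = re.compile(r"^\s*✗\s+(?P<name>[^,\s]+)")
PASSED = re.compile(r"^\s*✓\s+(?P<name>\S+)")


def collect_outcomes(log_lines):
    """Map each test name to its ordered list of outcomes (True means passed)."""
    outcomes = {}
    for line in log_lines:
        for pattern, passed in ((FAILED, False), (PASSED, True)):
            match = pattern.match(line)
            if match:
                outcomes.setdefault(match.group("name"), []).append(passed)
                break
    return outcomes


def failed_then_passed(outcomes):
    """Names that failed at least once but whose last attempt passed."""
    return sorted(
        name for name, results in outcomes.items()
        if False in results and results[-1]
    )


def never_passed(outcomes):
    """Names for which no attempt passed; these should fail the build."""
    return sorted(name for name, results in outcomes.items() if not any(results))
```

If `never_passed` returns any names for a build that was reported as successful, the retry mechanism alone would not explain the result.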

ofalvai avatar Nov 03 '21 07:11 ofalvai

I think this is the retry mechanism in action, and although I don't see what each test case is doing, they probably fail on the first try and succeed on the second.

Oh, that's super interesting. Unfortunately, this wasn't the case for us. When I ran tests locally, they consistently failed, which is why I opened this issue. It was a legitimate test failure that slipped through on a PR where a code change actually broke functionality.

One new thought comes to mind: these specific tests (and most of the test suite) use Quick and Nimble instead of plain XCTestCase, which I know sometimes have issues integrating nicely with Xcode. Perhaps this was part of the problem.


I will try to get this to you as soon as I can, but it will take some time to recreate those specific builds since the project has continued to move forward.

Circling back to my previous comment above. I apologize for not getting these specific builds+logs ready. Unfortunately, that is now even less likely to happen. I'm starting some time off (this was a client project for me as a contractor), so I won't be able to revisit this in a reasonable amount of time. I will, however, try to provide these builds+logs when I can, if you would like to leave this issue open.

jessesquires avatar Nov 03 '21 18:11 jessesquires

No worries and thank you for taking the time to investigate this so far. I'll keep this issue open, hopefully someone else affected can chime in.

ofalvai avatar Nov 05 '21 10:11 ofalvai

Hello there, I'm a bot. On behalf of the community I thank you for opening this issue.

To help our human contributors focus on the most relevant reports, I check up on old issues to see if they're still relevant. This issue has had no activity for 90 days, so I marked it as stale.

The community would appreciate if you could check if the issue still persists. If it isn't, please close it. If the issue persists, and you'd like to remove the stale label, you simply need to leave a comment. Your comment can be as simple as "still important to me".

If no comment is left within 21 days, this issue will be closed.

bitrise-coresteps-bot avatar Feb 04 '22 08:02 bitrise-coresteps-bot

I think this is the retry mechanism in action, and although I don't see what each test case is doing, they probably fail on the first try and succeed on the second.

@ofalvai @Bence1001 We can confirm that this behavior happens for us when we use retry_on_failure. However, it's hard to deduce this from looking at Bitrise, because the Test Report Bitrise integration does not reflect the fact that the test may have passed on a later (repeated) run. In other words, given

test_1 fails
...
test_1 reruns and passes

the Test Report Bitrise integration shows test_1 as having failed. @jessesquires this is probably why you said this?

As you can see, they are incorrectly reported as "succeeded" even though there are failures.

This sounds like a bug with the Test Reports (or this step; not sure) that ought to be fixed given the confusion this may cause for anyone trying to identify actual failures in a build.
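
To make the expected behavior concrete, here is a minimal sketch of how repeated attempts could be collapsed into one final status per test. The status labels and the function name `summarize_attempts` are made up for illustration; this is not the Test Reports implementation.

```python
def summarize_attempts(attempts):
    """Collapse (test_name, passed) attempts into one final status per test.

    The status labels are hypothetical; the last attempt decides the verdict.
    """
    grouped = {}
    for name, passed in attempts:
        grouped.setdefault(name, []).append(passed)

    summary = {}
    for name, results in grouped.items():
        if all(results):
            status = "passed"
        elif results[-1]:
            status = "passed_after_retry"
        else:
            status = "failed"
        summary[name] = {"status": status, "attempts": len(results)}
    return summary
```

With this rule, the `test_1` example above would show as `passed_after_retry` rather than as failed, while a test that fails on every attempt would still show as `failed`.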

Olivér, here is a link to a sample build log that exhibits the behavior we've noted above: https://app.bitrise.io/build/8133b16d-106e-45b8-88f5-1ab93253a90c#?tab=log

cc @enlivn

SaiKhal avatar Feb 16 '22 19:02 SaiKhal

Hey @SaiKhal 👋🏼

In our case, there was actually a legitimate test failure. That is, when I ran tests locally on my machine, they failed (as expected, because an earlier change broke the tests.) The problem was that Bitrise, somehow, reported success.

However, to your point, I agree that the reports/logs are confusing if a test fails and then succeeds.


Related: a nice feature would be for Bitrise to actively identify and report flakey tests. 😄 cc @ofalvai

jessesquires avatar Feb 16 '22 20:02 jessesquires

@jessesquires regarding your comment:

Related: a nice feature would be for Bitrise to actively identify and report flakey tests.

Insights Pro provides this feature already

See: https://blog.bitrise.io/post/introducing-build-insights-pro-for-build-and-test-analytics

DamienBitrise avatar Aug 25 '22 02:08 DamienBitrise