Debug stuck unit test

Open TheLastProject opened this issue 2 years ago • 10 comments

There's a unit test that keeps randomly getting stuck, causing a timeout after 6 hours. It's annoying. I can't reproduce it locally, but it happens in GitHub Actions constantly.

TheLastProject avatar Jun 04 '23 18:06 TheLastProject

Is there any output generated that you can see?

obfusk avatar Jun 04 '23 19:06 obfusk

Anything that might give some kind of clue to what might be causing it?

obfusk avatar Jun 04 '23 19:06 obfusk

Or which specific test gets stuck?

obfusk avatar Jun 04 '23 19:06 obfusk

protect.card_locker.LoyaltyCardViewActivityTest > startWithoutParametersCaptureBarcodeCreateLoyaltyCard STARTED
##[debug]Re-evaluate condition on job cancellation for step: 'Run unit tests'.
Error: The operation was canceled.

https://github.com/CatimaLoyalty/Android/actions/runs/5169203344/jobs/9311309410

TheLastProject avatar Jun 04 '23 20:06 TheLastProject

I've been able to confirm it gets stuck in activityController.visible() in startWithoutParametersCaptureBarcodeCreateLoyaltyCard(). But I have no idea why. Or why only in that specific test.

obfusk avatar Jun 05 '23 18:06 obfusk

So... funny story. After replacing actions/setup-java with apt-get install openjdk-17-jdk-headless, I have not yet been able to reproduce this bug on GitHub Actions again, which would explain being unable to reproduce it locally.

diff --git a/.github/workflows/android.yml b/.github/workflows/android.yml
index 4a0acb02..1ed3caae 100644
--- a/.github/workflows/android.yml
+++ b/.github/workflows/android.yml
@@ -10,6 +10,9 @@ on:
     branches:
       - main

+env:
+  JAVA_HOME: /usr/lib/jvm/java-17-openjdk-amd64
+
 jobs:
   build:

@@ -20,11 +23,11 @@ jobs:
     - name: Fail on bad translations
       run: if grep -ri "<xliff" app/src/main/res/values*/strings.xml; then echo "Invalidly escaped translations found"; exit 1; fi
     - uses: gradle/wrapper-validation-action@v1
-    - name: set up JDK 17
-      uses: actions/setup-java@v2
-      with:
-        distribution: 'temurin'
-        java-version: '17'
+    - name: set up OpenJDK 17
+      run: |
+        sudo apt-get update
+        sudo apt-get install -y openjdk-17-jdk-headless
+        sudo update-alternatives --auto java
     - name: Build
       run: ./gradlew assembleRelease
     - name: Check lint

obfusk avatar Jun 06 '23 23:06 obfusk

> I have not yet been able to reproduce this bug on GitHub Actions again.

Sadly, it got stuck again, but only after more than 20 runs, whereas before it got stuck in about 1 in 7 runs.
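
As a rough check on whether that is a real improvement (assuming runs are independent and the earlier hang rate really was about 1 in 7, both of which are simplifications), the chance of 20 clean runs in a row at the old rate would be:

```latex
\left(1 - \tfrac{1}{7}\right)^{20} = \left(\tfrac{6}{7}\right)^{20} \approx 0.046
```

A streak that long would be unlikely (under 5%) if nothing had changed, which suggests the JDK swap lowered the hang rate without eliminating it.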

obfusk avatar Jun 07 '23 11:06 obfusk

Looks like https://github.com/CatimaLoyalty/Android/actions/runs/5229783997/jobs/9442968697 got stuck twice and the retry after the timeout failed with:

2023-06-10T12:06:55.6116209Z protect.card_locker.LoyaltyCardViewActivityTest > startWithoutParametersCaptureBarcodeCreateLoyaltyCard STARTED
2023-06-10T14:35:17.0133272Z 
2023-06-10T14:35:17.0134017Z protect.card_locker.LoyaltyCardViewActivityTest > startWithoutParametersCaptureBarcodeCreateLoyaltyCard FAILED
2023-06-10T14:35:17.0134687Z     java.lang.OutOfMemoryError at Arrays.java:3657
2023-06-10T14:35:17.0134963Z 
2023-06-10T14:35:17.0135292Z protect.card_locker.LoyaltyCardViewActivityTest > startWithMissingLoyaltyCard STARTED
2023-06-10T14:35:17.0135659Z 
2023-06-10T14:35:17.0136019Z protect.card_locker.LoyaltyCardViewActivityTest > startWithMissingLoyaltyCard FAILED
2023-06-10T14:35:17.0136868Z     java.lang.OutOfMemoryError at Provider.java:470

obfusk avatar Jun 10 '23 14:06 obfusk

I guess maybe we should time out the second time too? And/or kill Gradle before retrying?
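
A minimal sketch of that idea, not the project's actual workflow: every attempt runs under `timeout`, and any leftover Gradle daemon is stopped with `./gradlew --stop` before the next attempt. The function name `run_with_retry`, the attempt count, and the time limit are hypothetical.

```shell
# Sketch: run a command under a time limit, retrying a bounded number of times.
# run_with_retry and its arguments are illustrative, not taken from the project.
run_with_retry() {
  attempts="$1"
  limit="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # timeout(1) sends SIGTERM once the limit is reached; -k 30 follows up
    # with SIGKILL if the command is still alive 30 seconds later.
    if timeout -k 30 "$limit" "$@"; then
      return 0
    fi
    echo "attempt $i of $attempts failed or timed out" >&2
    # Stop any Gradle daemon left over from the failed attempt so the retry
    # starts with a fresh JVM. Skipped when there is no wrapper in this directory.
    if [ -x ./gradlew ]; then
      ./gradlew --stop || true
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (task name and limit are assumptions):
#   run_with_retry 2 30m ./gradlew test
```

Because every attempt is bounded, a second hang fails the job after the limit instead of running into the 6-hour cancellation.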

obfusk avatar Jun 10 '23 15:06 obfusk

I finally had it happen locally, once. The CPU fan was blowing hard, so it's probably some kind of busy loop. Sadly, there's no way to debug it when it's this infrequent.
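
One way to capture evidence the next time it hangs is to take a thread dump of the test JVM with the JDK's `jps` and `jstack` tools. The sketch below is hedged: the function name `dump_jvm_threads` is hypothetical, and the default pattern assumes Gradle test workers show up in `jps -l` with a main class containing `GradleWorkerMain`.

```shell
# Sketch: write a thread dump for every running JVM whose main class matches
# a pattern. dump_jvm_threads and the default pattern are assumptions.
dump_jvm_threads() {
  pattern="${1:-GradleWorkerMain}"
  out_dir="${2:-.}"
  # jps -l prints "<pid> <main class or jar>" for each running JVM.
  jps -l | awk -v pat="$pattern" '$2 ~ pat { print $1 }' | while read -r pid; do
    # jstack prints the stack of every thread in the target JVM. A thread in
    # a busy loop usually stays RUNNABLE in the same frames across dumps.
    jstack "$pid" > "$out_dir/threads-$pid.txt" 2>&1 || true
    echo "wrote $out_dir/threads-$pid.txt"
  done
}

# Example: run from a second terminal while the test is stuck.
#   dump_jvm_threads GradleWorkerMain /tmp
```

Taking two or three dumps a few seconds apart and comparing them should show which thread is spinning, even if the hang only shows up once in many runs.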

obfusk avatar Jul 16 '23 20:07 obfusk