[BUG] Testing in CI / local dev environments fails when all tests are run at once
Bug description
When we run all the unit tests in one pass, some of the tests for the sampling classes fail. The same tests pass when run individually. We observed this failure both in CI and in the local dev environment.
Steps/Code to reproduce bug
- Pull the commit 0d932adb3fce55df7b8b0ad4ef2775d0982c223e from this PR
- Run all the tests in your local environment.
Expected behavior
All the tests should pass.
Environment details
- Merlin version:
- Platform: Linux-5.13.0-35-generic-x86_64-with-glibc2.17
- Python version: 3.8.12
- Tensorflow version (GPU?): 2.8.0 (True)
Additional context
Gabriel and I tried to debug the issue. It seems that the training variable is somehow set to False at some point and then behaves like a global variable for all subsequent tests. This causes the samplers to fail, because in training=False mode no samples are added to the queue.
Not a blocker for 22.04
A similar issue seems to happen with Transformers4Rec, and we're just skipping the affected tests (https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/401/files) until this is resolved.
@sararb can we close this issue?