[BUG] Testing in CI / local dev environments fails when all tests are run at once

Open sararb opened this issue 3 years ago • 3 comments

Bug description

When we run all the unit tests in one pass, some of the tests for the sampling classes fail. The same tests pass when run individually. We observed this failure both in CI and in a local dev environment.

Steps/Code to reproduce bug

  1. Pull the commit 0d932adb3fce55df7b8b0ad4ef2775d0982c223e from this PR
  2. Run all the tests in your local environment.

Expected behavior

All the tests should pass.

Environment details

  • Merlin version:
  • Platform: Linux-5.13.0-35-generic-x86_64-with-glibc2.17
  • Python version: 3.8.12
  • Tensorflow version (GPU?): 2.8.0 (True)

Additional context

Gabriel and I tried to debug the issue. It seems that the training variable is set to False at some point and then behaves like a global variable for all subsequent tests. This causes the samplers to fail, because in training=False mode no samples are added to the queue.
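To illustrate the suspected mechanism, here is a minimal sketch in plain Python. It does not use the Merlin or TensorFlow APIs: `_TRAINING`, `set_training`, `ToySampler`, and `training_mode` are hypothetical names invented for this example, standing in for whatever global state the real tests share. It only shows how a flag that one test flips and never restores can break a later test that passes on its own.

```python
from contextlib import contextmanager

# Hypothetical module-level flag standing in for a global "training" mode.
_TRAINING = True


def set_training(value):
    """Set the global flag and return its previous value."""
    global _TRAINING
    previous = _TRAINING
    _TRAINING = value
    return previous


class ToySampler:
    """Hypothetical sampler that only enqueues items in training mode."""

    def __init__(self):
        self.queue = []

    def add(self, item):
        if _TRAINING:
            self.queue.append(item)


@contextmanager
def training_mode(value):
    """Temporarily set the flag, restoring the old value on exit."""
    previous = set_training(value)
    try:
        yield
    finally:
        set_training(previous)


def eval_step_leaky():
    # Flips the flag and never restores it: later tests inherit False.
    set_training(False)
    sampler = ToySampler()
    sampler.add("item")
    return sampler.queue


def eval_step_isolated():
    # Same check, but the flag is restored when the block exits.
    with training_mode(False):
        sampler = ToySampler()
        sampler.add("item")
        return sampler.queue


def sampler_fills_queue():
    # Stands in for a sampler test that expects training mode to be on.
    sampler = ToySampler()
    sampler.add("item")
    return sampler.queue == ["item"]
```

Under these assumptions, calling `sampler_fills_queue()` after `eval_step_isolated()` still returns True, while calling it after `eval_step_leaky()` returns False, which matches the "passes alone, fails in the full run" pattern. In a pytest suite, an autouse fixture that restores the flag after each test might serve the same purpose as `training_mode`.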

sararb, Mar 31 '22 12:03

Not a blocker for 22.04

viswa-nvidia, Apr 01 '22 19:04

A similar issue seems to happen with Transformers4Rec, and for now we're just skipping the affected tests (https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/401/files) until this is resolved.
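For readers unfamiliar with the workaround, here is a minimal sketch of temporarily skipping an affected test. It uses the standard-library `unittest` module as an assumption for the sake of a self-contained example; the linked PR may well use a different mechanism (for example pytest markers). The class and test names are hypothetical.

```python
import unittest


class SamplerTests(unittest.TestCase):
    # Hypothetical test case; the names are illustrative only.

    @unittest.skip("fails when the full suite runs; skipped until the issue is resolved")
    def test_sampler_fills_queue(self):
        self.fail("not executed while the skip decorator is in place")

    def test_unaffected(self):
        self.assertEqual(1 + 1, 2)


def run_suite():
    """Run the test case programmatically and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SamplerTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The skipped test is recorded in `result.skipped` together with the reason string, so it stays visible in reports rather than silently disappearing.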

benfred, Apr 06 '22 19:04

@sararb can we close this issue?

rnyak, Jun 09 '22 01:06