
Test results are non-deterministic

Open traversaro opened this issue 1 year ago • 5 comments

For example, see this output of two tests runs on the same commit:

  • https://github.com/ami-iit/ADAM/actions/runs/4977875545/jobs/8907456613?pr=41
  • https://github.com/ami-iit/ADAM/actions/runs/4977875540/jobs/8907456632?pr=41

The reason is that we call np.random but never set the seed, so the test results differ at every run (see https://adamj.eu/tech/2018/01/08/pytest-randomly-history/ and https://towardsdatascience.com/random-seeds-and-reproducibility-933da79446e3). The long-term plan may be to implement some way of controlling randomness (for example via https://github.com/pytest-dev/pytest-randomly), but in the short term perhaps the easy fix is to increase the test tolerance.
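
For reference, here is a minimal sketch of the pattern that causes the flakiness; the joint count, function names, and tolerance are made up for illustration and are not the actual ADAM test code:

```python
import numpy as np

# Illustrative sketch (not the actual ADAM tests): every run draws a fresh
# random joint configuration because no seed is ever set, so the numerical
# error of the cross-backend comparison changes from run to run.
N_DOF = 7  # hypothetical number of joints, for illustration only

def test_backend_agreement():
    joints = np.random.uniform(-np.pi, np.pi, size=N_DOF)  # unseeded draw
    # compare the same quantity computed by two backends, e.g.:
    # assert np.allclose(numpy_result(joints), jax_result(joints), atol=1e-5)
```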

traversaro avatar May 15 '23 08:05 traversaro

CC @ami-iit/artificial-mechanical-intelligence

DanielePucci avatar May 15 '23 14:05 DanielePucci

Thanks @traversaro! :) Agreed! I'll open a PR to increase the tolerance, but I plan to address this issue in a more structured way later. P.S. I do not really understand why the Jax and PyTorch tests are the only ones failing. Is there some strange interaction between these two frameworks and numpy?

P.P.S. Maybe it's a stupid solution. What if I set np.random.seed(0)? I guess it will always give me the same sequence of random numbers.

Giulero avatar May 16 '23 09:05 Giulero

> P.P.S. Maybe it's a stupid solution. What if I set np.random.seed(0)? I guess it will always give me the same sequence of random numbers.

That should work fine. Reading https://adamj.eu/tech/2018/01/08/pytest-randomly-history/ and similar posts, it seems that people dislike this approach because it perturbs the global state and could therefore influence other tests, but for our specific case it should be fine (it is what we do in iDynTree, for example: https://github.com/robotology/idyntree/blob/35b0f76a9db3809384e8ebcbdb7cfb11d2cb7a7b/bindings/python/tests/joints.py#L31 and https://github.com/robotology/idyntree/blob/35b0f76a9db3809384e8ebcbdb7cfb11d2cb7a7b/src/estimation/tests/KalmanFilterUnitTest.cpp#L84).
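
As a concrete sketch of that approach (the placement is an assumption; the iDynTree links above show the same idea in their own test files):

```python
import numpy as np

# Seed the global NumPy RNG once at the top of the test module, so every run
# draws the same sequence of random joint configurations.
np.random.seed(0)
```

If perturbing the global state is a concern (the criticism in the post above), a per-test generator created with numpy.random.default_rng(seed) keeps the seeding local to each test instead.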

> P.S. I do not really understand why the Jax and PyTorch tests are the only ones failing. Is there some strange interaction between these two frameworks and numpy?

I guess that for some joint configurations the numerical error induced by how these frameworks perform the computation is larger, but it is just an intuition.

traversaro avatar May 16 '23 11:05 traversaro

> That should work fine. Reading https://adamj.eu/tech/2018/01/08/pytest-randomly-history/ and similar posts, it seems that people dislike this approach because it perturbs the global state and could therefore influence other tests, but for our specific case it should be fine

So I could start with this approach and then proceed with a more refined solution.

> I guess that for some joint configurations the numerical error induced by how these frameworks perform the computation is larger, but it is just an intuition.

I suspect the same. I did some tests, and it seems that setting torch.set_default_dtype(torch.float64) and config.update("jax_enable_x64", True) (as in #39), along with np.random.seed(42), does the job: the tests no longer fail.
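
Putting that together, a minimal sketch of the combination described above (the exact file and placement are an assumption; see #39 for the actual change):

```python
import numpy as np
import torch
from jax import config

# Compute in double precision in both frameworks and fix the NumPy seed, so
# every run evaluates the same configurations with comparable numerical error.
torch.set_default_dtype(torch.float64)
config.update("jax_enable_x64", True)
np.random.seed(42)
```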

Giulero avatar May 16 '23 11:05 Giulero

Just for the record, the solution proposed in https://github.com/ami-iit/ADAM/issues/42#issuecomment-1549505862 is implemented in #39.

Giulero avatar May 16 '23 12:05 Giulero