Test failures on OSS platform

Open sathyaphoenix opened this issue 2 years ago • 5 comments

Discussed in https://github.com/facebook/CacheLib/discussions/61

Originally posted by vicvicg on October 11, 2021:

When running the CacheLib tests following the instructions (https://cachelib.org/docs/installation/testing), we get different pass rates depending on the environment. Some test failures seem to be intermittent, and we have not seen a 100% pass rate. Is there a recommended system setup and a subset of tests that we can use as acceptance criteria for code changes?

NvmCacheTests.ConcurrentFills failure:

I0930 19:26:45.051738  8699 BigHash.cpp:110] Reset BigHash
I0930 19:26:45.051754  8699 BlockCache.cpp:611] Reset block cache
/opt/workspace/cachelib/allocator/nvmcache/tests/NvmCacheTests.cpp:385: Failure
Expected: (hdl) != (nullptr), actual: nullptr vs (nullptr)
/opt/workspace/cachelib/allocator/nvmcache/tests/NvmCacheTests.cpp:385: Failure
Expected: (hdl) != (nullptr), actual: nullptr vs (nullptr)

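Timer test failure: this looks like a poorly written test. It expects the measured duration to equal the sleep duration exactly (1487 ms measured vs. 1484 ms expected in the run below) and does not allow for the extra time a sleep call can take.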

Running main() from /opt/workspace/cachelib/external/googletest/googletest/src/gtest_main.cc
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from Util
[ RUN      ] Util.TimerTest
/opt/workspace/cachelib/common/tests/TimeTests.cpp:40: Failure
Expected equality of these values:
  timer.getDurationMs()
    Which is: 1487
  rnd
    Which is: 1484
[  FAILED  ] Util.TimerTest (1487 ms)
[----------] 1 test from Util (1487 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (1487 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Util.TimerTest

 1 FAILED TEST
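
One way to make a timing check like this less brittle is to assert a range instead of exact equality. The following is a minimal sketch, not the actual CacheLib test or its fix: `sleepAndMeasureMs` and `withinTolerance` are hypothetical helpers, and the 100 ms slack is an arbitrary value chosen for illustration. It also assumes the sleep and the measurement use the same monotonic clock.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper (not part of CacheLib): sleeps for requestedMs and
// returns the elapsed time in whole milliseconds, measured with a
// monotonic clock.
inline int64_t sleepAndMeasureMs(int64_t requestedMs) {
  const auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(std::chrono::milliseconds(requestedMs));
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
      .count();
}

// Hypothetical check: the elapsed time must be at least the requested
// sleep and at most the requested sleep plus slackMs. A sleep may overrun
// because of scheduling delays, so exact equality is not expected.
inline bool withinTolerance(int64_t elapsedMs,
                            int64_t requestedMs,
                            int64_t slackMs) {
  return elapsedMs >= requestedMs && elapsedMs <= requestedMs + slackMs;
}
```

With the values from the log above, `withinTolerance(1487, 1484, 100)` holds even though `1487 == 1484` does not. In a GoogleTest case the same idea could be expressed with `EXPECT_GE` and `EXPECT_LE` (or `EXPECT_NEAR`) in place of `EXPECT_EQ`. A suitable slack depends on how loaded the test machine is.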

sathyaphoenix, Oct 13 '21 03:10

Here are the contents of the Dockerfile that we use to build containers to reproduce the mentioned failures:

FROM registry.hub.docker.com/library/centos:8

RUN dnf install -y \
    sudo \
    git \
    tzdata \
    vim \
    gdb \
    clang

COPY contrib/prerequisites-centos8.sh prerequisites-centos8.sh
RUN sed 's/sudo //' -i prerequisites-centos8.sh
RUN ./prerequisites-centos8.sh

Docker run command:

docker run --rm --name vic --tmpfs /tmp -v /home/vic/cachlib/:/opt/workspace:z -e http_proxy=<…> -e https_proxy=<…> -it cachelib:centos-8 /bin/bash

vicvicg, Oct 13 '21 18:10

@vicvicg - thanks, I'm working on reproducing it locally. Does your Docker environment enforce any additional restrictions (e.g. seccomp, AppArmor, or similar)?

agordon, Oct 13 '21 20:10

@agordon: No, our Docker environment doesn't enforce any additional restrictions.

vicvicg, Oct 14 '21 13:10

Hi! We found that NvmCacheTests.ConcurrentFills fails under a certain race condition. We will be working on a fix.

haowu14, Jul 07 '22 03:07

@vicvicg Do you still see test failures in the build?

haowu14, Aug 22 '22 18:08