datajoint-python
datajoint-python copied to clipboard
Pytest `test_datetime_serialization_speed` occasionally fails
Bug Report
Description
The pytest tests/test_blob.py::test_datetime_serialization_speed fails non-deterministically.
Reproducibility
Include:
- OS: Ubuntu 23.04
- CPU: Intel i7-7600U (4) @ 3.900GHz
- Mem: 16GB
- Python: 3.8.17
- MySQL Version: 8.0
- MySQL Deployment Strategy (local-native | local-docker | remote):
local-docker(docker://datajoint/mysql:8.0) - DataJoint Version:
0.14.1commit 03db252aaf956e6cd71e08241eb16b3de43a1d10
Steps to Reproduce
In a terminal window in the VSCode DevContainer on dev-tests branch (03db252aaf956e6cd71e08241eb16b3de43a1d10), issue for i in {1..10}; do pytest tests/test_blob.py::test_datetime_serialization_speed; done to run the test 10 times. On my laptop, the test fails once out of 10 times. Since this issue seems hardware-dependent, it might be impossible to reproduce on some systems.
Error Stack
Console Logs
$ for i in {1..10}; do pytest tests/test_blob.py::test_datetime_seria
lization_speed; done
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 2.57s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 2.73s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 2.72s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 2.75s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py F [100%]
================================================================ FAILURES =================================================================
____________________________________________________ test_datetime_serialization_speed ____________________________________________________
def test_datetime_serialization_speed():
# If this fails that means for some reason deserializing/serializing
# np arrays of np.datetime64 types is now slower than regular arrays of datetime
optimized_exe_time = timeit.timeit(
setup="myarr=pack(np.array([np.datetime64('2022-10-13 03:03:13') for _ in range(0, 10000)]))",
stmt="unpack(myarr)",
number=10,
globals=globals(),
)
print(f"np time {optimized_exe_time}")
baseline_exe_time = timeit.timeit(
setup="myarr2=pack(np.array([datetime(2022,10,13,3,3,13) for _ in range (0, 10000)]))",
stmt="unpack(myarr2)",
number=10,
globals=globals(),
)
print(f"python time {baseline_exe_time}")
> assert optimized_exe_time * 900 < baseline_exe_time
E assert (0.00273081300110789 * 900) < 2.3093061309991754
tests/test_blob.py:233: AssertionError
---------------------------------------------------------- Captured stdout call -----------------------------------------------------------
np time 0.00273081300110789
python time 2.3093061309991754
========================================================= short test summary info =========================================================
FAILED tests/test_blob.py::test_datetime_serialization_speed - assert (0.00273081300110789 * 900) < 2.3093061309991754
============================================================ 1 failed in 2.75s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 2.91s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 2.70s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 3.70s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 3.91s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item
tests/test_blob.py . [100%]
============================================================ 1 passed in 3.09s ============================================================
Expected Behavior
The pytest tests/test_blob.py::test_datetime_serialization_speed passes every time without failing, regardless of hardware.
Screenshots
N/A
Additional Research and Context
This test fails often (~10% of invocations) when it is run on my laptop (2017 ThinkPad X1 Carbon) in a DevContainer. The test fails very infrequently (~1% of invocations) when run in CI; one example is on this CI run in #1133. I suspect that a slow or busy CPU causes packing of np.datetime64 arrays to happen unexpectedly slowly:
https://github.com/ethho/datajoint-python/blob/e755b9763d950137718fb8447967ceef5172f57d/tests/test_blob.py#L218-L223
optimized_exe_time = timeit.timeit(
setup="myarr=pack(np.array([np.datetime64('2022-10-13 03:03:13') for _ in range(0, 10000)]))",
stmt="unpack(myarr)",
number=10,
globals=globals(),
)
Additional Context
As of 12/18/2023, this test doesn't exist on branch master. This issue was discovered during development of #1142.