ArcticDB
Investigate performance disparity between Windows and Linux tests
Is your feature request related to a problem? Please describe. Many tests perform much worse on Windows than on Linux, which slows down the overall CI builds. The effect is amplified when we have to run many iterations of the same test. For now, the number of iterations has been reduced to bring down the build time.
This was done in the following places:
- test_append.py::gen_params_append
- test_engine.py::gen_params
- test_engine.py::gen_params_non_contiguous
Describe the solution you'd like We should investigate whether we can improve performance on Windows so that we can increase the number of iterations that can be run in a reasonable time.
Investigation findings: I found that the S3 tests are much slower than the tests for other storages. Out of a total of ~1500 integration tests, I skipped all the S3 tests (around 300) and ran all the others. The run time dropped from ~54 mins to ~11 mins on Windows and from ~32 mins to ~17 mins on Linux. This shows that the S3 tests account for most of the run time on Windows and close to half of it on Linux.
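To make the comparison above easier to read, here is a rough back-of-the-envelope calculation in Python. It only uses the approximate figures quoted above (~1500 tests, ~300 of them S3, and the before/after run times); the function name `s3_breakdown` is made up for illustration, and the results are only as precise as those rounded inputs.

```python
def s3_breakdown(total_min, non_s3_min, total_tests=1500, s3_tests=300):
    """Estimate how much of a test run is spent in S3 tests.

    All inputs are the approximate figures from the investigation,
    so the outputs are estimates, not measurements.
    """
    s3_min = total_min - non_s3_min
    return {
        "s3_minutes": s3_min,
        "s3_share": s3_min / total_min,
        "s3_sec_per_test": s3_min * 60 / s3_tests,
        "other_sec_per_test": non_s3_min * 60 / (total_tests - s3_tests),
    }


# (total run time, run time with S3 tests skipped), in minutes
runs = {"Windows": (54, 11), "Linux": (32, 17)}

for platform, (total, non_s3) in runs.items():
    b = s3_breakdown(total, non_s3)
    print(
        f"{platform}: S3 ~{b['s3_minutes']} min "
        f"({b['s3_share']:.0%} of the run), "
        f"~{b['s3_sec_per_test']:.1f} s per S3 test vs "
        f"~{b['other_sec_per_test']:.2f} s per other test"
    )
```

On these numbers, an average S3 test costs several seconds on both platforms, while an average non-S3 test costs well under a second, and the gap is noticeably wider on Windows.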
I investigated further by benchmarking individual tests and timing their individual parts. For instance, benchmarking tests/integration/arcticdb/test_arctic.py::test_dedup showed that the write operation is significantly slower on Windows in some instances. I checked the debug logs and found that the simulated S3 server was taking a long time to respond to some of the requests.
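For anyone who wants to repeat this kind of per-step timing, below is a minimal sketch of a timing helper using only the Python standard library. It is not the code used in the investigation; the names `timed`, `timings`, and `summarize` are hypothetical, and the `time.sleep` call is just a stand-in for the operation being measured (for example, the write inside a test).

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# label -> list of elapsed times in seconds (hypothetical helper state)
timings = defaultdict(list)


@contextmanager
def timed(label):
    """Record the wall-clock time spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append(time.perf_counter() - start)


def summarize():
    """Return {label: (call count, total seconds, slowest call in seconds)}."""
    return {
        label: (len(values), sum(values), max(values))
        for label, values in timings.items()
    }


# Stand-in for a slow step; in a real test this would wrap the write call.
with timed("write"):
    time.sleep(0.01)

for label, (count, total, slowest) in summarize().items():
    print(f"{label}: {count} call(s), total {total:.3f}s, slowest {slowest:.3f}s")
```

Tracking the slowest call separately from the total helps tell apart "every request is a bit slow" from "a few requests stall", which is the pattern the debug logs pointed to.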
Possible solution: it is worth trying another simulated S3 server, such as s3-proxy or fake-s3 (further research needed), which could resolve the slow responses and speed up the tests on both Windows and Linux.