
Testing Suite Updates

Open · Ethan-DeBandi99 opened this issue · 8 comments

The ideas conveyed here were previously explored, but the updates were not completed. Arkouda needs more robust, scalable testing that interfaces with our benchmarking for simplified maintenance. Key components are listed below:

  • Single configuration between testing and benchmarks.
  • Configurable tests. This is already supported in pytest-benchmark, but it needs to be expanded to pytest so that we get more code coverage and better testing at different scales (see the conftest.py sketch after this list).
  • JSON output of test results for more flexible reporting and review.
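
As a minimal sketch of the single-configuration idea (the option and fixture names here are hypothetical, not the final Arkouda configuration), a shared conftest.py could expose one problem-size option that both the tests and the benchmarks read:

```python
# conftest.py sketch (hypothetical names): a single problem-size option shared
# by tests and benchmarks, instead of maintaining two separate configurations.
import pytest

def pytest_addoption(parser):
    parser.addoption(
        "--size",
        action="store",
        default="1000",
        help="problem size shared by tests and benchmarks",
    )

@pytest.fixture
def prob_size(request):
    # Any test or benchmark that requests this fixture sees the same value
    return int(request.config.getoption("--size"))
```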

Steps to Complete (issues will be added for these at a later date):

  • [x] Remove any of the prototype code from the previous research into this so that we have a clean slate to start. (#2500)

  • [x] Configure the architecture and configuration files. This will include pytest.ini and conftest.py. We only want to maintain one set of configuration files for testing and benchmarking. Benchmarking will be turned off by default and manually enabled when needed. (#2504)

  • [x] Configure tests to scale and use the same/similar parameters for problem size and configuration as the benchmarks (Configuration included as part of #2504. This will require ongoing updates to handle different cases).

  • [ ] Ensure the Benchmark Correctness checks from the old benchmarking system are configured and runnable (see the correctness-check sketch after this list).

  • [x] Convert all tests to new format

    Simple Conversions:

    • [x] #2536
    • [x] #2537
    • [x] #2570
    • [x] #2572
    • [x] #2583
    • [x] #2605
    • [x] #2616
    • [x] #2620
    • [x] #2626
    • [x] #2648
    • [x] numeric_test.py
    • [x] #2573
    • [x] #2640
    • [x] #2642
    • [x] #2651
    • [x] #2654
    • [x] #2656
    • [x] #2700
    • [x] #2705
    • [x] #2668

    Complex Conversions:
    • [x] #2686
    • [x] #2538
    • [x] #2625
    • [x] #2559
    • [x] #2697
    • [x] #2694
    • [x] #2607
    • [x] #2539
    • [x] #2542
    • [x] #2610
    • [x] #2688
    • [x] #2659
    • [x] #2547
    • [x] #2684

    check.py should be reviewed to determine if any testing needs to be moved to another file. Otherwise, it will be removed.

  • [ ] Replace the existing test suite with the new one and update the make commands to run the updated code.

  • [ ] Configure the benchmarks so that testing and benchmarks use the same configuration files.

  • [x] #2768

  • [x] Ensure CI is functional. Add a flake8 check to CI for the tests.
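
As a rough illustration of what a runnable correctness check could look like (a hypothetical test, not the old benchmarking system's actual code), the sketch below verifies an Arkouda result against its NumPy equivalent at the configured problem size:

```python
# Hypothetical correctness check; assumes a running arkouda server and that
# conftest.py has already called ak.connect() and provides prob_size.
import numpy as np
import arkouda as ak

def test_sort_correctness(prob_size):
    a = ak.randint(0, 2**32, prob_size)
    res = ak.sort(a).to_ndarray()
    # The sorted result must be non-decreasing
    assert np.all(res[:-1] <= res[1:])
```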

Ethan-DeBandi99 · Jun 13 '23 19:06

Maybe we also create dashboards to display the JSON output? I'm not sure about that one, which is why I didn't edit, but the rest looks like the right way forward to me!

stress-tess · Jun 13 '23 19:06

Just leaving some notes on how to adjust the configuration so that the new suite can be tested without interfering with the existing test code. This may change as we continue through development. We want to maintain the existing framework until the new suite is ready to replace it, but we do not want to cause any issues between them.

To run with the Prototype Test Suite: This will allow make test and python3 -m pytest to leverage the prototype code for testing.

  • Rename tests directory to OLD_tests
  • Rename PROTO_tests directory to tests
  • Rename pytest.ini to pytest_OLD.ini
  • Rename pytest_PROTO.ini to pytest.ini

To revert before pushing so the existing test suite runs:

  • Rename tests directory to PROTO_tests
  • Rename OLD_tests directory to tests
  • Rename pytest.ini to pytest_PROTO.ini
  • Rename pytest_OLD.ini to pytest.ini
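
The two rename procedures above could also be scripted; a minimal sketch (a hypothetical helper, to be deleted once the prototype suite replaces the old one):

```python
# toggle_suite.py sketch: swap between the prototype and existing suites by
# applying the rename pairs listed above. Run from the repository root.
from pathlib import Path

def use_proto_suite():
    Path("tests").rename("OLD_tests")
    Path("PROTO_tests").rename("tests")
    Path("pytest.ini").rename("pytest_OLD.ini")
    Path("pytest_PROTO.ini").rename("pytest.ini")

def revert_to_old_suite():
    Path("tests").rename("PROTO_tests")
    Path("OLD_tests").rename("tests")
    Path("pytest.ini").rename("pytest_PROTO.ini")
    Path("pytest_OLD.ini").rename("pytest.ini")
```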

Additional Notes

The following modifications will not interfere with functionality, but they prevent additional changes from being required when switching between suites. They should be removed when the Prototype suite is released, but will not negatively impact anything if they are not.

  • In pytest_PROTO.ini, OLD_tests and benchmark* are listed under norecursedirs. This prevents the conftest.py files in those directories from interfering with incremental changes to the prototype.
  • In pytest.ini, PROTO_tests is listed under norecursedirs, which prevents its conftest.py from being read and interfering with the existing tests.
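
A sketch of the relevant lines (abbreviated; the real files may list additional directories under norecursedirs):

```ini
# pytest_PROTO.ini (sketch)
[pytest]
norecursedirs = OLD_tests benchmark*

# pytest.ini (sketch)
[pytest]
norecursedirs = PROTO_tests
```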

Ethan-DeBandi99 · Jun 15 '23 16:06

@pierce314159, @hokiegeek2 - Wanted to get your input on how we should handle problem size, because I think there are a few ways we can go about it. First, I don't think the problem size needs to apply to every test case, but for IO, for example, we definitely need to be able to adjust it. I have laid out some options below.

Option 1: Problem size is configurable as a single integer value. This is what we currently use for benchmarking, and there I think it makes sense because we want to run a large-scale test for performance measurements but are not necessarily worried about testing edge cases.

Option 2: Configure the parser to allow the problem size to be a comma-delimited list of sizes to test with. We can then use pytest's parametrize feature to run any variable-size test once for each of the supplied problem sizes, with each size recorded as an individual test run. This would still allow a single test size to be provided when not doing multi-locale testing, and anything that only needs a single size could just grab a value from the list. Although, we could parametrize everything that uses the problem size for ultimate flexibility.

Option 3: This would use a scaled problem size. The input would be similar to Option 1, but the configuration code would take the provided value and apply a multiplier. This would be a little less exact, especially when testing features that are susceptible to variance in data distribution.

Option 4: For tests that we know need varying problem sizes, we can hardcode the sizes to use and ignore the problem size provided in the configuration.

Personally, I am leaning towards Option 2 because it gives us the most flexibility, and the parameters are recorded in the JSON output automatically, so we will not need extra code to add that metadata. I wanted to get input before putting the initial architecture PR up so I can adjust a few tests to reflect the decision here.
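
A sketch of Option 2 using pytest's parametrization hook (option and fixture names are hypothetical): each size in the comma-delimited list becomes its own test run, and because the size appears in the test ID, it is carried into the JSON output automatically.

```python
# conftest.py sketch: expand a comma-delimited --prob_size into one
# parametrized run per size for any test that requests the prob_size fixture.
def pytest_addoption(parser):
    parser.addoption(
        "--prob_size",
        action="store",
        default="100,100000",
        help="comma-delimited list of problem sizes",
    )

def pytest_generate_tests(metafunc):
    if "prob_size" in metafunc.fixturenames:
        opts = metafunc.config.getoption("--prob_size")
        metafunc.parametrize("prob_size", [int(s) for s in opts.split(",")])
```

A test that takes prob_size as an argument then runs once per supplied size; anything that only needs one size can simply take the first value.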

Ethan-DeBandi99 · Jun 16 '23 13:06

@Ethan-DeBandi99 I also vote option 2

hokiegeek2 · Jun 16 '23 13:06

I set up the tests that currently use the problem size from the configuration to use Option 2 and verified that it has the desired outcome. I do want to note that we will need to update how the benchmarks use the problem size when we integrate them with the updated tests to share the same configuration.

Ethan-DeBandi99 · Jun 16 '23 13:06

Just noting that when I run make test I always see the following error, which based on googling seems to mean pytest is unable to find any tests to run:

make: *** [test-python] Error 5

If I update this line in pytest_PROTO.ini to just be tests/*_test, it works just fine: https://github.com/Bears-R-Us/arkouda/blob/cdd15d30f59512ff76442b4b7977538f8060d96c/pytest_PROTO.ini#L9

This is pretty weird because it needs to be a different value on my laptop than on Ethan's. It's not pressing since we can all run the tests, but we should definitely figure this out at some point.
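
For reference, the workaround amounts to a one-line change along these lines (a sketch only; I'm assuming the line in question is a testpaths-style setting, and the real option name and value may differ):

```ini
# pytest_PROTO.ini (workaround sketch)
[pytest]
testpaths = tests/*_test.py
```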

stress-tess · Jul 13 '23 18:07

@pierce314159 I am starting to see the same issue. Let's keep an eye on it and if it continues, we can update.

Ethan-DeBandi99 · Jul 17 '23 13:07

This is a future work idea, but I wanted to capture it so I don't forget when I come back.

I think eventually a nice way to do at least edge-case testing is to have a big dictionary that uses a str of the objtype/dtype as the key and the edge-case arkouda object as the value. This would live somewhere accessible by all the test files. This is an oversimplification, but I think this could work pretty well with some tweaking.

If we want to grab a dictionary of edge cases for only certain dtypes, we could pass in those keys:

edge_case_dict = {dt: global_edge_case_dict[dt] for dt in dtypes}

and if we want just the array for a single dtype's edge cases, we could do something like

edge_case_arr = global_edge_case_dict[dtype]
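
A rough sketch of the idea (all names are hypothetical, and building the arrays requires a running arkouda server):

```python
# edge_cases.py sketch: a registry of edge-case arrays keyed by dtype string,
# importable by every test file. Assumes ak.connect() has already been called.
import arkouda as ak

global_edge_case_dict = {
    "int64": ak.array([0, -1, 2**63 - 1, -(2**63)]),
    "float64": ak.array([0.0, -0.0, float("inf"), float("-inf"), float("nan")]),
    "bool": ak.array([True, False]),
}

def edge_cases_for(dtypes):
    # Subset of the registry restricted to the requested dtypes
    return {dt: global_edge_case_dict[dt] for dt in dtypes}
```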

stress-tess · Jul 25 '23 20:07