slideflow
Added brute force method to preserved site k-fold split
In a083554, I added a unit test that ensures the generated splits are valid (all patients are used, and no site is present in more than one cross-fold). The test data was copied from your original submission.
Running the test only requires:
python3 crossfolds_test.py
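For reference, the two validity conditions described above can be sketched as a standalone check. This is a minimal illustration and not the contents of crossfolds_test.py: the function name `check_site_preserved_splits`, the `patient_sites` mapping, and the `folds` list-of-lists layout are all hypothetical, and the "each patient appears in only one fold" check is an added assumption about what "all patients are used" should imply.

```python
from collections import defaultdict
from typing import Dict, List


def check_site_preserved_splits(
    patient_sites: Dict[str, str],
    folds: List[List[str]],
) -> None:
    """Raise AssertionError if the folds break a validity condition.

    Hypothetical inputs: `patient_sites` maps patient ID -> site, and
    `folds` holds one list of patient IDs per cross-fold.
    """
    assigned = [patient for fold in folds for patient in fold]

    # Condition 1: all patients are used. The duplicate check is an
    # extra assumption: each patient should land in exactly one fold.
    assert len(assigned) == len(set(assigned)), "patient in multiple folds"
    assert set(assigned) == set(patient_sites), "not all patients were used"

    # Condition 2: no site is present in more than one cross-fold.
    site_folds = defaultdict(set)
    for fold_idx, fold in enumerate(folds):
        for patient in fold:
            site_folds[patient_sites[patient]].add(fold_idx)
    split_sites = sorted(s for s, idx in site_folds.items() if len(idx) > 1)
    assert not split_sites, f"sites split across folds: {split_sites}"
```

Calling it with a valid split returns silently; a split that places one site in two folds raises an `AssertionError` naming the offending sites.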
In 57ac792, I made formatting changes with minor refactoring only to improve readability - the underlying algorithm is the same. Code that is easier to read is also easier to refactor and optimize. The kinds of changes I made include:
- Added type hints and a docstring to the function declaration, to make it clear what the input arguments are and what the function does.
- Broke long code blocks into smaller discrete sections, with accompanying comments to explain what each section does.
- Used more succinct and easily interpretable variable names
- Limited line length to 80 characters
- Used list comprehensions to reduce the number of nested loops
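As a generic illustration of these kinds of changes (not code taken from 57ac792), the sketch below shows a nested-loop helper next to an equivalent version with type hints, a docstring, and a comprehension. Both function names and the `folds` / `patient_sites` data layout are hypothetical.

```python
from typing import Dict, List


def sites_per_fold_nested(folds, patient_sites):
    # Before: untyped, undocumented, two levels of explicit loops.
    result = []
    for fold in folds:
        sites = []
        for patient in fold:
            site = patient_sites[patient]
            if site not in sites:
                sites.append(site)
        result.append(sorted(sites))
    return result


def sites_per_fold(
    folds: List[List[str]],
    patient_sites: Dict[str, str],
) -> List[List[str]]:
    """Return the sorted, de-duplicated sites present in each fold.

    Args:
        folds: One list of patient IDs per cross-fold.
        patient_sites: Mapping of patient ID to site.
    """
    # After: a set comprehension de-duplicates sites within each fold,
    # and a list comprehension replaces the outer loop.
    return [sorted({patient_sites[p] for p in fold}) for fold in folds]
```

The two versions return the same result for the same input, which is the property a behavior-preserving refactor needs to keep.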
With the unit test now available, we can check that the generated splits are valid. The test fails both before and after the refactor, which indicates that the problem lies in the algorithm itself and was not introduced by the formatting changes.
For this next step, I'll have you familiarize yourself with the modified code and track down the cause of the failed unit test. Once the unit test passes, indicating that the algorithm is complete, we will move on to optimization.