[BUG] Kmeans balanced test fails when n_cols<=4
Describe the bug
The Kmeans balanced test fails when n_cols is less than or equal to 4. Those cases are not currently tested, but they should not fail. When I add the following test cases to the list, the {1000000, 1, 10}, {1000000, 2, 10}, and {1000000, 4, 10} cases fail. It's not clear to me whether the failure is caused by the n_cols parameter alone or by the combination of the three parameters.
std::vector<std::tuple<size_t, size_t, size_t>> row_cols_k = {
  {1000, 32, 5},
  {1000, 100, 20},
  {10000, 32, 10},
  {10000, 100, 50},
  {10000, 500, 100},
  {1000000, 128, 10},
  {1000000, 1, 10},
  {1000000, 2, 10},
  {1000000, 4, 10},
  {1000000, 8, 10},
  {1000000, 16, 10}};
Test output for failed tests:
[ FAILED ] 30 tests, listed below:
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFU32I32.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFU32I32.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFU32I32.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDDU32I32.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDDU32I32.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDDU32I32.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFU32I64.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFU32I64.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFU32I64.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDDU32I64.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDDU32I64.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDDU32I64.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI32I32.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI32I32.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI32I32.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI32I64.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI32I64.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI32I64.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI64I32.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI64I32.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI64I32.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI64I64.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI64I64.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFFI64I64.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFI8U32I32.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFI8U32I32.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestFI8U32I32.Result/8, where GetParam() = { 1000000, 4, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDI8U32I32.Result/6, where GetParam() = { 1000000, 1, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDI8U32I32.Result/7, where GetParam() = { 1000000, 2, 10, 200}
[ FAILED ] KmeansBalancedTests/KmeansBalancedTestDI8U32I32.Result/8, where GetParam() = { 1000000, 4, 10, 200}
30 FAILED TESTS
Steps/Code to reproduce bug
Add the aforementioned test cases to the test list, then build and run the test. You should see the same failures as in the output posted above.
Expected behavior
Those tests should pass.
Environment details
- Environment location: Bare-metal
- Method of RAFT install: from source
This was tested on an A100-80GB-PCIe system using the latest commit, 88e9a55.
Additional context
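One possible way to tell whether n_cols alone triggers the failure, or only its combination with the other two parameters, would be to test a small cross product of values instead of hand-picked tuples. The sketch below is only an illustration: make_isolation_grid is a hypothetical helper, not part of RAFT, and the values passed to it are assumptions chosen to bracket the failing cases.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical helper (not part of RAFT): returns every combination of the
// given row counts, column counts, and cluster counts, in the same
// {n_rows, n_cols, k} order used by row_cols_k in the test.
std::vector<std::tuple<size_t, size_t, size_t>> make_isolation_grid(
  const std::vector<size_t>& n_rows_list,
  const std::vector<size_t>& n_cols_list,
  const std::vector<size_t>& k_list)
{
  std::vector<std::tuple<size_t, size_t, size_t>> grid;
  grid.reserve(n_rows_list.size() * n_cols_list.size() * k_list.size());
  for (size_t n_rows : n_rows_list) {
    for (size_t n_cols : n_cols_list) {
      for (size_t k : k_list) {
        grid.emplace_back(n_rows, n_cols, k);
      }
    }
  }
  return grid;
}
```

For example, make_isolation_grid({1000, 1000000}, {1, 2, 4, 8}, {10}) yields 8 cases that could be assigned to row_cols_k. If the small-n_cols cases fail at both row counts, n_cols alone would be the likely trigger; if they fail only at 1000000 rows, the combination matters.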