oneDAL
Possible DAAL_CHECK_EX error for fullNUsers in implicit_als
Describe the bug
I am developing a distributed implicit_als application and encountered this exception during initialization:
terminate called after throwing an instance of 'daal::services::interface1::Exception'
what(): Incorrect parameter
Details:
Parameter name: fullNUsers
When I divided the data into more partitions, the exception disappeared, so I think my partitioning method is correct. I read the code, and the exception traces back to these two lines:
https://github.com/oneapi-src/oneDAL/blob/master/cpp/daal/src/algorithms/implicit_als/implicit_als_train_init_partial_result.cpp#L148
Since the input data is transposed, algParameter->fullNUsers may need to be compared against the number of columns, which is nUsers, rather than the number of rows.
It would be better to have the original author take a quick look. I did not build the oneDAL code, so I am not sure this is where the problem comes from. Please close the issue if I am wrong.
To Reproduce
Partition the transposed input matrix so that each partition is large enough that its number of rows exceeds fullNUsers.
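A minimal sketch of the suspected mismatch, assuming each transposed partition has items as rows and users as columns. The `TransposedBlock` struct and both check functions are hypothetical stand-ins written for illustration; they are not oneDAL APIs, and the exact comparison used inside DAAL_CHECK_EX is an assumption inferred from the behavior described above.

```cpp
#include <cstddef>

// Hypothetical stand-in for one partition of the transposed input matrix.
// Assumption: rows are items, columns are users.
struct TransposedBlock {
    std::size_t nRows;    // items held by this partition
    std::size_t nColumns; // users (expected to equal the full user count)
};

// Suspected current behavior (assumption): fullNUsers is validated
// against the row count, so a partition with many items is rejected.
bool checkAgainstRows(const TransposedBlock& block, std::size_t fullNUsers) {
    return fullNUsers >= block.nRows;
}

// Proposed behavior (assumption): fullNUsers is validated against the
// column count, which is nUsers for transposed data.
bool checkAgainstColumns(const TransposedBlock& block, std::size_t fullNUsers) {
    return fullNUsers >= block.nColumns;
}
```

With a partition of 1000 rows and 100 columns and fullNUsers set to 100, the row-based check rejects the input while the column-based check accepts it. This matches the observation that splitting the data into more (smaller) partitions makes the exception disappear.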
Expected behavior
No exception is thrown.
@PIVOVAR3AL
@xwu99, the issue is fixed by https://github.com/oneapi-src/oneDAL/pull/1493. Please verify that your case works now.
@Alexsandruss Thanks for the fix, I will check. I have also opened several other issues related to ALS: #1476, #1513, and #1514. The blocking one is #1513.