
Probably DAAL_CHECK_EX error for fullNUsers in implicit_als

Open xwu99 opened this issue 4 years ago • 3 comments

Describe the bug: I am developing a distributed implicit_als application and encountered this exception during initialization:

terminate called after throwing an instance of 'daal::services::interface1::Exception'
  what():  Incorrect parameter
Details:
Parameter name: fullNUsers

When I divided the data into more partitions, the exception disappeared, so I think my partitioning method is correct. I read the code, and the error traces to these two lines:
https://github.com/oneapi-src/oneDAL/blob/master/cpp/daal/src/algorithms/implicit_als/implicit_als_train_init_partial_result.cpp#L148

Since the input data is transposed, algParameter->fullNUsers may need to be compared to the number of columns, which is nUsers, rather than to the number of rows.
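To illustrate the suspicion, here is a minimal sketch that does not use oneDAL itself. The `TransposedPartition` struct and both check functions are hypothetical stand-ins, and the row-based comparison is only my assumption about what the current check does; I have not confirmed it against the linked source.

```cpp
#include <cstddef>

// Hypothetical stand-in for one partition of the transposed input matrix:
// rows are the items held by this partition, columns are the users.
struct TransposedPartition {
    std::size_t nRows;    // number of items in this partition
    std::size_t nColumns; // number of users (nUsers)
};

// Assumed shape of the current check: fullNUsers is compared to the
// partition's row count, so a large partition is rejected.
bool checkAgainstRows(const TransposedPartition& p, std::size_t fullNUsers) {
    return p.nRows <= fullNUsers;
}

// Suggested alternative: compare fullNUsers to the column count,
// which is nUsers when the data is transposed.
bool checkAgainstColumns(const TransposedPartition& p, std::size_t fullNUsers) {
    return p.nColumns == fullNUsers;
}
```

With 100 users, a partition holding 1000 items fails the row-based check but passes the column-based one, while a partition holding 50 items passes both. That would match what I see: the exception goes away when the data is split into more, smaller partitions.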

It would be best to have the original author take a quick look. I have not built oneDAL from source, so I am not sure this is where the problem comes from. Please close the issue if I am wrong.

To Reproduce: Partition the transposed input matrix so that each partition is large enough that its row count exceeds fullNUsers.

Expected behavior: No exception is thrown.

xwu99 avatar Dec 01 '20 15:12 xwu99

@PIVOVAR3AL

xwu99 avatar Dec 01 '20 15:12 xwu99

@xwu99, this issue is fixed by https://github.com/oneapi-src/oneDAL/pull/1493. Please verify that your case is working now.

Alexsandruss avatar Mar 23 '21 17:03 Alexsandruss

@Alexsandruss Thanks for the fix, I will check. I have also opened several issues related to ALS: #1476, #1513, and #1514. The blocking one is #1513.

xwu99 avatar Mar 24 '21 01:03 xwu99