DeepMatch-Torch
Two Bugs regarding YoutubeDNN
There are two bugs in the code for the YoutubeDNN model.
- `gen_data_set_youteube` has a typo in its name... it should be `youtube`. (Not necessarily a bug lol)
- Here's the first bug: `gen_data_set_youteube` produces ONLY negative samples, without any positive samples. Consequently, all training labels will be 0.
- The second one: `[neg_list[item_idx] for item_idx in np.random.choice(neg_list, negsample)]` is not correct. It should use the sampled indexes directly; see the sketch below.
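To illustrate the second bug, here is a minimal runnable sketch of the buggy pattern and one possible fix; the contents of `neg_list` and the value of `negsample` below are made up for the example:

```python
import numpy as np

# Hypothetical candidate pool: item IDs already re-mapped to dense indexes.
neg_list = np.array([3, 17, 42, 8, 25])
negsample = 3

# Buggy pattern: np.random.choice(neg_list, negsample) already returns sampled
# *items*, so using them again as positions into neg_list is a double lookup
# and raises IndexError whenever a sampled value >= len(neg_list).
# neg_samples = [neg_list[item_idx] for item_idx in np.random.choice(neg_list, negsample)]

# Fix: use the sampled items directly ...
neg_samples = np.random.choice(neg_list, negsample)

# ... or, equivalently, sample positions first and then index into neg_list.
positions = np.random.choice(len(neg_list), negsample)
neg_samples = neg_list[positions]
```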
- You should also look at the `YouTubeDNN` model file. In `gen_data_set_youtube`, what we build is `[1, 0, 0, 0, ...]`: we set the first candidate as the positive and the rest as negatives to simulate `SampledSoftmax` (sketched below). To see more information, you can see here.
- Maybe it is a little confusing, but it truly is the index of the items, because we normalize the item IDs to indexes.
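If it helps future readers, here is my reading of that labeling scheme as a runnable sketch; the variable names and tensor shapes are my own, not the repo's exact code:

```python
import torch
import torch.nn.functional as F

# Each row scores 1 positive plus `negsample` negatives, and the positive is
# always placed in slot 0. The stored label is therefore the class index 0
# (one-hot view: [1, 0, 0, 0, ...]), which is why every label in the dataset
# looks like 0 even though training is not all-negative.
batch_size, negsample = 8, 4
logits = torch.randn(batch_size, negsample + 1)     # scores for [pos, neg, ...]
labels = torch.zeros(batch_size, dtype=torch.long)  # positive sits at index 0

# Cross-entropy over the sampled candidates mimics sampled softmax.
loss = F.cross_entropy(logits, labels)
```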
Looking forward to your reply~
Thanks for the explanation! Now it makes sense to me why all the labels are 0. It would be helpful to add some comments in preprocessing.py to prevent confusion in the future!
I am still a bit unsure about the issue with the negative sampling. Previously, when I ran the code, an out-of-bounds error occurred. Will let you know if I have updates.