Two Bugs regarding YoutubeDNN

Open · NeverSalar opened this issue 2 years ago · 2 comments

There are two bugs in the code for the YoutubeDNN model.

  1. gen_data_set_youteube has a typo... it should be youtube. (Not necessarily a bug, lol.)

  2. Here's the first bug: gen_data_set_youteube produces negative samples ONLY, without any positive samples. Consequently, all training labels are 0.

  3. The second one: [neg_list[item_idx] for item_idx in np.random.choice(neg_list, negsample)] is not correct. np.random.choice(neg_list, negsample) draws values from neg_list, not positions, so using those values as indices can go out of bounds. It should sample the indexes directly (see the sketch below).
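
A minimal sketch of the indexing concern in point 3. The names neg_list and negsample follow the issue; the data values and the "safe" variant are assumptions for illustration only:

```python
import numpy as np

np.random.seed(0)  # deterministic for the demo

# Hypothetical data: every id here is >= len(neg_list), so the questioned
# pattern is guaranteed to go out of bounds. With real data it would only
# fail sometimes, matching an intermittent out-of-boundary error.
neg_list = np.array([10, 42, 7, 99])  # candidate negative item ids
negsample = 2

# The questioned pattern: np.random.choice(neg_list, n) draws VALUES from
# neg_list, and those values are then reused as indices into neg_list.
try:
    picks = [neg_list[item_idx] for item_idx in np.random.choice(neg_list, negsample)]
except IndexError as e:
    print("out of bounds:", e)

# The pattern above is only safe when every id in neg_list is itself a
# valid position (0..len-1), i.e. when ids are normalized to indexes,
# which is what the maintainer confirms below.

# Sampling positions instead is safe for arbitrary id values:
picks = [neg_list[i] for i in np.random.choice(len(neg_list), negsample)]
print(picks)
```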

NeverSalar commented Jul 30 '22 03:07

  1. You should also look at the YouTubeDNN model file. In gen_data_set_youtube, what we generate is [1, 0, 0, 0, ...]: we set the first one as positive and the rest as negatives to simulate SampledSoftmax (a sketch follows below). For more information, you can see here.
  2. Maybe it is a little confusing, but it truly is the index of items, because we normalize the item ids to indexes.
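
A rough sketch of the labeling scheme described in point 1. make_group is a hypothetical helper, not the repository's actual gen_data_set_youtube; the pool and sample sizes are made up:

```python
import numpy as np

# Hypothetical helper (not the repo's real implementation): for each
# positive item, attach `negsample` sampled negatives, putting the true
# item first so the per-group labels come out as [1, 0, 0, ...].
def make_group(pos_item, item_pool, negsample, rng=None):
    rng = rng or np.random.default_rng(0)
    # Note: a real implementation should also exclude the positive item
    # (and the user's other clicked items) from the negative draws.
    neg_idx = rng.choice(len(item_pool), size=negsample, replace=False)
    items = [pos_item] + [item_pool[i] for i in neg_idx]
    labels = [1] + [0] * negsample  # first = positive, rest = negatives
    return items, labels

items, labels = make_group(pos_item=12, item_pool=list(range(100)), negsample=4)
print(items)   # positive item first, then 4 sampled negatives
print(labels)  # [1, 0, 0, 0, 0]
```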

Hope for your reply~

bbruceyuan commented Jul 30 '22 08:07

Thanks for the explanation! Now it makes sense to me why all the labels are 0. It would be helpful to add some comments in preprocessing.py to prevent confusion in the future!

I am still a bit unsure about the issue with the negative sampling. Previously, when I ran the code, an out-of-boundary error occurred. I will let you know if I have updates.

NeverSalar commented Jul 30 '22 19:07