BERT4Rec-VAE-Pytorch
question about the result
Why are your results so much better than the original paper's? Is there anything different?
The sampling strategy! You should use popularity-based sampling, and you should reimplement it according to the original paper.
Where are the negative samples even used for Bert? I don't see it referenced anywhere in the code other than being initialized.
You may want to have a look at the BERT4Rec and SASRec papers, where two different negative sampling strategies are described. The implementation in this library corresponds to neither of them.
How is the sampling so much different?
> the common strategy in [12, 22, 49], pairing each ground truth item in the test set with 100 randomly sampled negative items that the user has not interacted with. To make the sampling reliable and representative [19], these 100 negative items are sampled according to their popularity. Hence, the task becomes to rank these negative items with the ground truth item for each user
That is the sampling described in the paper, and from what it looks like, it is also the sampling strategy used here.
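For reference, the protocol quoted above can be sketched roughly as follows. This is a minimal illustration, not code from this repository; the function name and arguments are hypothetical.

```python
import random
from collections import Counter

def sample_popular_negatives(train_interactions, user_items, num_samples=100, seed=0):
    """Sample negatives for one user, weighted by item popularity.

    train_interactions: flat list of item ids across all users (used for counts)
    user_items: set of items this user has interacted with (excluded)
    Illustrative sketch only; names do not come from the repository.
    """
    rng = random.Random(seed)
    popularity = Counter(train_interactions)
    candidates = [i for i in popularity if i not in user_items]
    weights = [popularity[i] for i in candidates]
    num_samples = min(num_samples, len(candidates))
    negatives = set()
    while len(negatives) < num_samples:
        # popularity-weighted draw; reject repeats until we have enough
        negatives.add(rng.choices(candidates, weights=weights, k=1)[0])
    return list(negatives)
```

At evaluation time, the model then ranks the ground-truth item against these sampled negatives per user.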
The only difference I can see is that random popularity-weighted sampling isn't used; instead, the most popular items are taken as the samples. I'm surprised this would yield a 2x improvement over the original BERT4Rec paper.
The implementation in this library is not correct (I believe), because most of the negative samples will not change with the negative-sample seed, which will result in overfitting if you tune hyperparameters on the validation set. In terms of results, random sampling > popularity sampling > top-popular sampling. As for why this happens, I think the model itself may have a popularity bias learnt from the statistics of the training set, so it is harder for the model to distinguish the ground-truth label from a set of popular candidates.
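The distinction being drawn here can be made concrete: top-popular sampling is deterministic (the seed has no effect), while uniform random sampling changes with the seed. A minimal sketch, with hypothetical function names:

```python
import random
from collections import Counter

def random_negatives(all_items, user_items, k, seed):
    # Uniform random: every unseen item is equally likely; varies with the seed.
    rng = random.Random(seed)
    return rng.sample([i for i in all_items if i not in user_items], k)

def top_popular_negatives(popularity, user_items, k):
    # Top-popular: deterministically take the k most popular unseen items.
    # Note there is no seed at all -- the candidate set never changes,
    # which is the property criticised above.
    ranked = [i for i, _ in popularity.most_common() if i not in user_items]
    return ranked[:k]
```

Because `top_popular_negatives` returns the same candidates on every run, tuning hyperparameters against it effectively tunes against a fixed, easier-to-memorise evaluation set.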
Recsys.pdf — Different negative sampling strategies may cause nearly a 2x improvement; see the related research in this paper [2020, RecSys], Figure 4.
The sampling strategy! You should use popularity-based sampling, and you should reimplement it according to the original paper.
Why did I use popularity-based sampling and get results much worse than in the paper?
Do both the training set and the test set use popularity-based sampling?