BERT4Rec-VAE-Pytorch

question about the result

Open 1245244103 opened this issue 3 years ago • 8 comments

Why are your results so much better than those in the original paper? Is there anything different?

1245244103 avatar Jul 23 '21 10:07 1245244103

The sampling strategy! You should use popular sampling, and you should reimplement the sampler according to the original paper.

Zibo-Zhao avatar Aug 10 '21 17:08 Zibo-Zhao

Where are the negative samples even used for Bert? I don't see it referenced anywhere in the code other than being initialized.

zanussbaum avatar Oct 06 '21 22:10 zanussbaum

You may want to have a look at the BERT4Rec and SASRec papers, where two different negative sampling strategies are described. The implementation in this library corresponds to neither of them.


Zibo-Zhao avatar Oct 07 '21 09:10 Zibo-Zhao

How is the sampling so much different?

the common strategy in [12, 22, 49], pairing each ground truth item
in the test set with 100 randomly sampled negative items that the
user has not interacted with. To make the sampling reliable and
representative [19], these 100 negative items are sampled according
to their popularity. Hence, the task becomes to rank these negative
items with the ground truth item for each user

is the sampling described in the paper, and from what it looks like, is the sampling strategy here.
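For anyone following along, the ranking step in the quoted passage can be sketched as below. This is only an illustration: the helper names and the toy scores are hypothetical, and hit rate and NDCG at a cutoff `k` are assumed as the metrics since the quote itself does not name them.

```python
import math


def rank_of_ground_truth(scores, ground_truth, negatives):
    """Return the 0-based rank of the ground-truth item among the candidates."""
    target = scores[ground_truth]
    # Every negative scored strictly higher pushes the ground truth down one place.
    return sum(1 for item in negatives if scores[item] > target)


def hit_rate_at_k(rank, k):
    """1.0 if the ground truth lands in the top k, else 0.0."""
    return 1.0 if rank < k else 0.0


def ndcg_at_k(rank, k):
    """With one relevant item the ideal DCG is 1, so NDCG is a single discounted gain."""
    return 1.0 / math.log2(rank + 2) if rank < k else 0.0


# Toy example with 3 negatives instead of 100 (hypothetical scores).
scores = {"gt": 0.7, "n1": 0.9, "n2": 0.4, "n3": 0.1}
rank = rank_of_ground_truth(scores, "gt", ["n1", "n2", "n3"])
```

Because the ground truth is only compared against the sampled negatives, the choice of negatives directly changes the rank and therefore the reported metric.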

The only difference I can see here is that random popular sampling isn't used and instead the most popular samples are used. I'm surprised this would yield a 2x improvement over the original BERT4Rec paper.
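To make the two variants concrete, here is a minimal sketch, assuming popularity is just the interaction count over all user histories. The function names and toy data are hypothetical and are not taken from this repository or from the original BERT4Rec code.

```python
import random
from collections import Counter


def item_popularity(user_histories):
    """Count how often each item appears across all user histories."""
    counts = Counter()
    for items in user_histories.values():
        counts.update(items)
    return counts


def top_popular_negatives(popularity, seen, n_neg):
    """Deterministic: the n_neg most popular items the user has not interacted with."""
    ranked = [item for item, _ in popularity.most_common()]
    return [item for item in ranked if item not in seen][:n_neg]


def popularity_weighted_negatives(popularity, seen, n_neg, rng):
    """Random: draw n_neg distinct unseen items with probability proportional to popularity."""
    candidates = [item for item in popularity if item not in seen]
    if len(candidates) < n_neg:
        raise ValueError("not enough unseen items to sample from")
    weights = [popularity[item] for item in candidates]
    picked = []
    while len(picked) < n_neg:
        item = rng.choices(candidates, weights=weights, k=1)[0]
        if item not in picked:  # reject duplicates so negatives stay distinct
            picked.append(item)
    return picked


# Toy data (hypothetical); the paper uses 100 negatives, 2 are used here.
histories = {"u1": [1, 2, 3], "u2": [1, 2, 4], "u3": [1, 5, 6], "u4": [1, 2, 7]}
popularity = item_popularity(histories)
seen = set(histories["u3"])
fixed = top_popular_negatives(popularity, seen, n_neg=2)
sampled = popularity_weighted_negatives(popularity, seen, n_neg=2, rng=random.Random(0))
```

The first function always returns the head of the popularity ranking, while the second can also return tail items, only less often.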

zanussbaum avatar Oct 07 '21 19:10 zanussbaum

I believe the implementation in this library is not correct, because most of the negative samples do not change with the negative sample seed, which will result in overfitting if you tune hyperparameters on the validation set. In terms of results, random sampling > popular sampling > top-popular sampling. As for why this happens, I think the model itself may have a popularity bias learned from the statistics of the training set, so it is harder for the model to distinguish the ground-truth label from a set of popular candidates.
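The seed point can be checked with a small experiment. The sketch below assumes a sampler that takes the top-N most popular unseen items and compares it with a popularity-weighted random sampler. The popularity counts are synthetic and the function names are hypothetical, so it illustrates the argument rather than reproducing this repository's sampler.

```python
import random


def top_popular(popularity, seen, n_neg):
    """Top-N most popular unseen items; there is no randomness to seed."""
    ranked = sorted(popularity, key=lambda item: (-popularity[item], item))
    return [item for item in ranked if item not in seen][:n_neg]


def weighted_random(popularity, seen, n_neg, seed):
    """Draw n_neg distinct unseen items with probability proportional to popularity."""
    rng = random.Random(seed)
    candidates = [item for item in popularity if item not in seen]
    weights = [popularity[item] for item in candidates]
    picked = set()
    while len(picked) < n_neg:
        picked.add(rng.choices(candidates, weights=weights, k=1)[0])
    return picked


def jaccard(a, b):
    """Overlap between two negative sets: 1.0 means identical."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Synthetic long-tailed popularity: item i has count 1000 // i (hypothetical data).
popularity = {item: 1000 // item for item in range(1, 1001)}
seen = {1, 2, 3}

# Two "runs" of each sampler, standing in for two different negative sample seeds.
top_overlap = jaccard(top_popular(popularity, seen, 100),
                      top_popular(popularity, seen, 100))
rand_overlap = jaccard(weighted_random(popularity, seen, 100, seed=0),
                       weighted_random(popularity, seen, 100, seed=1))
```

Here `top_overlap` is exactly 1.0 because the candidate set is fixed, whereas `rand_overlap` depends on the seeds. With a fixed candidate set, validation and test rank against the same popular items on every run, which is the overfitting concern described above.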


Zibo-Zhao avatar Oct 08 '21 06:10 Zibo-Zhao

Recsys.pdf

Different negative sampling methods may cause a nearly 2x improvement. See the related research in the attached paper (RecSys 2020), Figure 4.

ghost0913 avatar Nov 19 '21 09:11 ghost0913

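The sampling strategy! You should use popular sampling, and you should reimplement the sampler according to the original paper.

I used popular sampling, so why are my results much lower than those in the paper?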

liwenwei5110 avatar Dec 17 '21 02:12 liwenwei5110

The sampling strategy! You should use popular sampling, and you should reimplement the sampler according to the original paper.

Do both the training set and the test set use popular sampling?

liwenwei5110 avatar Dec 17 '21 03:12 liwenwei5110