Yupeng Hou comments

Results: 88 comments of Yupeng Hou

sparsity level analysis

Hi, thanks for your attention! For the sparsity analysis, one possible way is to temporarily modify the source code of RecBole 1.0.1. In detail, in `recbole/evaluator/metrics.py`, you can change the...

NotImplementedError: There is no transform named 'plm_emb'

Sorry for the late reply and thanks for pointing out this bug! I've fixed it in [05aa5cb](https://github.com/RUCAIBox/UniSRec/commit/05aa5cba2809112c32808f70d16abc61c05c6538). Please update the code and I think everything will be fine if you...

I would like to know what the purpose of this code

Hi, thanks for your attention! The purpose is to create some soft links in the `pretrain` dir. Taking the "Food" dataset as an example, we originally created `Food.feat1CLS` in the...
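As a rough illustration of creating a soft link inside a `pretrain` directory, here is a minimal Python sketch. The helper name `link_into_pretrain`, the directory layout, and the link name are hypothetical and chosen only for illustration; the excerpt above does not show the actual paths or script used in the repository.

```python
import os


def link_into_pretrain(src_file, pretrain_dir, link_name):
    """Create (or refresh) a soft link to src_file inside pretrain_dir.

    All arguments are illustrative; the real paths in the repo may differ.
    """
    os.makedirs(pretrain_dir, exist_ok=True)
    link_path = os.path.join(pretrain_dir, link_name)
    # Remove a stale link or file so the call can be repeated safely.
    if os.path.lexists(link_path):
        os.remove(link_path)
    # Link to the absolute path so the link does not depend on the cwd.
    os.symlink(os.path.abspath(src_file), link_path)
    return link_path
```

A soft link avoids copying a large feature file: the file is stored once and is reachable under a second name in the `pretrain` directory.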

I would like to know what the purpose of this code

Hi Yihong, Thanks!! I got the point. You are right, and there is indeed a bug here. I'll update the script later and thanks a lot for pointing this out! By...

I would like to know what the purpose of this code

1) Dataset filtering. We filter out items without metadata, then filter out users and items with fewer than 5 interactions (5-core filtering). Please let me know if there are still some other data preprocessing...
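The filtering step described above can be sketched as follows. This is a minimal illustration, not the repository's preprocessing script: the function name `k_core_filter` and the `(user, item)` tuple format are assumptions, and the comment does not say whether the original code filters once or repeats until stable. The sketch repeats the filter until no user or item falls below the threshold.

```python
from collections import Counter


def k_core_filter(interactions, items_with_metadata, k=5):
    """Filter (user, item) pairs: drop items lacking metadata, then k-core.

    Hypothetical helper for illustration; input is a list of (user, item).
    """
    # Step 1: drop interactions whose item has no metadata.
    kept = [(u, i) for (u, i) in interactions if i in items_with_metadata]

    # Step 2: repeatedly drop users and items with fewer than k
    # interactions until nothing changes (assumed iterative variant).
    while True:
        user_counts = Counter(u for u, _ in kept)
        item_counts = Counter(i for _, i in kept)
        filtered = [
            (u, i)
            for (u, i) in kept
            if user_counts[u] >= k and item_counts[i] >= k
        ]
        if len(filtered) == len(kept):
            return filtered
        kept = filtered
```

Iterating matters because removing a sparse user can push one of that user's items below the threshold, and vice versa; a single pass would leave such items in place.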

the metadata dataset can not be found

Thanks for catching the bug! I've updated the README and please refer to **metadata** links in the "Complete review data" table from [https://nijianmo.github.io/amazon/index.html](https://nijianmo.github.io/amazon/index.html).

Requesting Guidance on Extracting User and Item Embeddings

Hi, sorry for the late reply!

## 1. Loading existing checkpoints

Below is an example of loading UniSRec models (either pretrained or fine-tuned).

https://github.com/RUCAIBox/UniSRec/blob/05aa5cba2809112c32808f70d16abc61c05c6538/finetune.py#L37-L40

## 2. Mapping external IDs to...

How to pretrain in multi gpus?

Sorry for the late reply! I suspect a version mismatch in PyTorch or a related package. Could you please share the versions of `python`, `torch`, `cudatoolkit`, and `recbole` in your environment,...

How to pretrain in multi gpus?

> python==3.9.7 pytorch==1.11.0 cudatoolkit==11.3.1 recbole=1.1.1 The machine is an A100.

Thanks! I'll try to reproduce the bug and get back to you as soon as I can.

Questions about pre-training

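Hi, thanks for reproducing the experiments and for raising these good questions! Sorry for the late reply. The numbers reported in our paper were obtained by tuning hyperparameters, selecting the setting that performed best on the validation set, and then evaluating on the test set. We also observed that a model without pre-training may do very well on the validation set under one particular hyperparameter setting yet generalize poorly to the test set. Using a single hyperparameter setting without repeated runs can make the conclusion fairly random. That said, we did observe that in UniSRec the gain from pre-training the recommendation model is rather limited, and negative transfer can even occur in some cases. We therefore later proposed a new model, [VQ-Rec](https://github.com/RUCAIBox/VQ-Rec), which uses the same experimental setup and benefits from pre-training fairly stably when only text features are used.

![WX20240319-010359@2x](https://github.com/RUCAIBox/UniSRec/assets/29252610/6b90b093-371d-4aa4-b85a-3ce0148736d6)

However, as the community's understanding of pre-training and scaling laws keeps evolving, how to improve recommendation models through pre-training remains an open problem, and UniSRec and VQ-Rec may also have drawn conclusions that look questionable today (for example: how many parameters should the pre-trained model have? How much should be allowed to change during fine-tuning? What would indicate overfitting? How can one tell whether pre-training is necessary?).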