Scenario-Wise-Rec
Benchmark for Multi-Scenario Recommendation.
1. Introduction
Scenario-Wise Rec is an open-source benchmark for multi-scenario/multi-domain recommendation.
Dataset introduction
Dataset | Domain | # Interaction | # User | # Item |
---|---|---|---|---|
MovieLens | Domain 0 | 210,747 | 1,325 | 3,429 |
MovieLens | Domain 1 | 395,556 | 2,096 | 3,508 |
MovieLens | Domain 2 | 393,906 | 2,619 | 3,595 |
KuaiRand | Domain 0 | 2,407,352 | 961 | 1,596,491 |
KuaiRand | Domain 1 | 7,760,237 | 991 | 2,741,383 |
KuaiRand | Domain 2 | 895,385 | 171 | 332,210 |
KuaiRand | Domain 3 | 402,366 | 832 | 547,908 |
KuaiRand | Domain 4 | 183,403 | 832 | 43,106 |
Ali-CCP | Domain 0 | 32,236,951 | 89,283 | 465,870 |
Ali-CCP | Domain 1 | 639,897 | 2,561 | 188,610 |
Ali-CCP | Domain 2 | 52,439,671 | 150,471 | 467,122 |
Amazon | Domain 0 | 198,502 | 22,363 | 12,101 |
Amazon | Domain 1 | 278,677 | 39,387 | 23,033 |
Amazon | Domain 2 | 346,355 | 38,609 | 18,534 |
Douban | Domain 0 | 227,251 | 2,212 | 95,872 |
Douban | Domain 1 | 179,847 | 1,820 | 79,878 |
Douban | Domain 2 | 1,278,401 | 2,712 | 34,893 |
Mind | Domain 0 | 26,057,579 | 737,687 | 8,086 |
Mind | Domain 1 | 11,206,494 | 678,268 | 1,797 |
Mind | Domain 2 | 10,237,589 | 696,918 | 8,284 |
Mind | Domain 3 | 9,226,382 | 656,970 | 1,804 |
Model introduction
Model | model_name | Link |
---|---|---|
Shared Bottom | sharedbottom | Link |
MMOE | mmoe | Link |
PLE | ple | Link |
SAR-Net | sarnet | Link |
STAR | star | Link |
M2M | m2m | Link |
AdaSparse | adasparse | Link |
AdaptDHM | adaptdhm | Link |
EPNet | epnet | Link |
PPNet | ppnet | Link |
HAMUR | hamur | Link |
M3oE | m3oe | Link |
2. Installation
WARNING: Our package is still under active development; feel free to open an issue if you run into any usage problems.
Install via GitHub (Recommended)
First, clone the repo:
git clone https://github.com/Xiaopengli1/Scenario-Wise-Rec.git
Then change into the repository directory:
cd Scenario-Wise-Rec
Finally, use pip to install the package:
pip install .
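Once installation finishes, a quick import check confirms the package is visible to Python. Note that the module name scenario_wise_rec below is our assumption about the import name; adjust it if the installed package differs.

```python
# Quick post-install sanity check. The module name `scenario_wise_rec`
# is an assumption about the import name; adjust it if it differs.
import scenario_wise_rec

print(scenario_wise_rec)  # prints the module and where it was installed from
```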
3. Usage
We provide running scripts under /scripts, and dataset samples under /scripts/data. You can test a model directly (for example, on Ali-CCP) by running:
python run_ali_ccp_ctr_ranking_multi_domain.py --model [model_name]
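For instance, using the model names from the table above, the STAR model can be run on the Ali-CCP sample with:
python run_ali_ccp_ctr_ranking_multi_domain.py --model star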
To use the full datasets, follow the steps below.
Step 1: Download the Full Datasets
Six multi-scenario/multi-domain datasets are provided; see the following table.
Dataset | Domain Number | Users | Items | Interaction | Download |
---|---|---|---|---|---|
MovieLens | 3 | 6k | 4k | 1M | ML_Download |
KuaiRand | 5 | 1k | 4M | 11M | KR_Download |
Ali-CCP | 3 | 238k | 467k | 85M | AC_Download |
Amazon | 3 | 85k | 54k | 823k | AZ_Download |
Douban | 3 | 2k | 210k | 1.7M | DB_Download |
Mind | 4 | 748k | 20k | 56M | MD_Download |
After downloading, substitute the sampled dataset under /scripts/data with the full dataset.
Step 2: Run the Code
python run_movielens_rank_multi_domain.py --dataset_path [path] --model_name [model_name] --device ["cpu"/"cuda:0"] --epoch [maximum epoch] --learning_rate [1e-3/1e-5] --batch_size [2048/4096] --seed [random seed]
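A filled-in invocation might look like the line below; the dataset path is a hypothetical placeholder for wherever you stored the downloaded data, and the hyperparameter values are illustrative rather than recommended settings:
python run_movielens_rank_multi_domain.py --dataset_path ./data/movielens --model_name star --device cuda:0 --epoch 100 --learning_rate 1e-3 --batch_size 2048 --seed 2022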
4. Tutorial
To help you get started, we provide a comprehensive Colab tutorial that walks through every essential step needed to use this benchmark effectively. It is designed with user-friendliness in mind and covers the following key aspects:
- Package Installation
- Data Download
- Model/Data Loading
- Model Training
- Result Evaluation
Each section of the tutorial is self-contained and easy to follow, making it a valuable resource for both beginners and experienced users.
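As a small taste of the Result Evaluation step, the sketch below computes a per-domain AUC from model predictions grouped by a domain indicator. It is a generic scikit-learn illustration rather than this package's own API, and the helper name and argument names are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def per_domain_auc(y_true, y_pred, domain_ids):
    """Hypothetical helper: compute AUC separately for each domain."""
    y_true, y_pred, domain_ids = map(np.asarray, (y_true, y_pred, domain_ids))
    scores = {}
    for d in np.unique(domain_ids):
        mask = domain_ids == d
        # AUC is only defined when both classes appear in the domain.
        if len(np.unique(y_true[mask])) == 2:
            scores[int(d)] = roc_auc_score(y_true[mask], y_pred[mask])
    return scores


# Toy example: two domains with two samples each.
print(per_domain_auc([0, 1, 1, 0], [0.2, 0.8, 0.6, 0.4], [0, 0, 1, 1]))
```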
5. Build Your Own Multi-scenario Dataset/Model
We provide two template files, run_example.py and base_example.py, as a pipeline to help you process your own multi-scenario datasets and build your own multi-scenario models.
Instructions on Processing Your Dataset
See run_example.py. The function get_example_dataset(input_path) is an example of how to process your dataset. Note that the feature "domain_indicator" is the feature that indicates which domain each sample belongs to. For other implementation details, refer to the raw file, and see the sketch below for the general shape.
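The following is a minimal sketch of what such a preprocessing function can look like, assuming a CSV file with user_id, item_id, label, and a raw scenario column; the column names and encoding choices are illustrative assumptions, not the repository's exact schema.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def get_example_dataset(input_path):
    """Illustrative preprocessing sketch; column names are assumptions."""
    df = pd.read_csv(input_path)

    # Map the raw scenario column to consecutive integer ids and expose it as
    # "domain_indicator", the feature used to tell domains apart.
    df["domain_indicator"] = LabelEncoder().fit_transform(df["scenario_id"])

    # Encode the sparse id features as integers for the embedding layers.
    for col in ["user_id", "item_id"]:
        df[col] = LabelEncoder().fit_transform(df[col])

    x = df[["user_id", "item_id", "domain_indicator"]]
    y = df["label"]
    return x, y
```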
Instructions on Building Your Model
See base_example.py, where you can build your own multi-scenario model. We leave two placeholders for users to implement the scenario-shared and scenario-specific parts, together with comments on how to process the final output. Please refer to the raw file for more details.
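For a rough picture of that structure, the PyTorch sketch below wires a scenario-shared encoder to one scenario-specific tower per domain and routes each sample by its domain indicator. The layer sizes and routing logic are illustrative assumptions, not the template's actual code.

```python
import torch
import torch.nn as nn


class ExampleMultiScenarioModel(nn.Module):
    """Illustrative sketch: shared bottom plus per-scenario towers."""

    def __init__(self, input_dim, num_domains, hidden_dim=64):
        super().__init__()
        # Scenario-shared part: one encoder used by every domain.
        self.shared = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # Scenario-specific part: one small tower per domain.
        self.towers = nn.ModuleList(nn.Linear(hidden_dim, 1) for _ in range(num_domains))

    def forward(self, x, domain_indicator):
        # domain_indicator: LongTensor of shape (batch,) with values in [0, num_domains).
        h = self.shared(x)
        # Score the shared representation with every tower, then keep, for each
        # sample, the output of the tower matching its own domain.
        all_logits = torch.stack([tower(h).squeeze(-1) for tower in self.towers], dim=1)
        logits = all_logits.gather(1, domain_indicator.view(-1, 1)).squeeze(1)
        return torch.sigmoid(logits)
```

A standard binary cross-entropy loss on the returned probabilities then trains the shared and scenario-specific parts jointly.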
6. Contributing
We welcome any contribution that could help improve the benchmark, and don't forget to star 🌟 our project!
7. Credits
This framework is built with reference to Torch-RecHub. Thanks for their contribution.
8. Citation
Please cite our paper if you find this repository interesting or helpful:
@article{li2024scenario,
title={Scenario-Wise Rec: A Multi-Scenario Recommendation Benchmark},
author={Li, Xiaopeng and Gao, Jingtong and Jia, Pengyue and Wang, Yichao and Wang, Wanyu and Wang, Yejing and Wang, Yuhao and Guo, Huifeng and Tang, Ruiming},
journal={arXiv preprint arXiv:2412.17374},
year={2024}
}