
Book dataset statistics don't match

Open Tokkiu opened this issue 3 years ago • 0 comments

Hi, I used the preprocessing script you provided to process the book dataset. I downloaded the data file from the website you mentioned. However, I got the following book statistics:

total items: 367982, total users: 603668, total behaviors: 8898041

The processed data you provided, however, has these statistics: total items: 313966, total users: 459133, total behaviors: 8898041

All I did was:

  1. Download the dataset from http://jmcauley.ucsd.edu/data/amazon/index.html
  2. Decompress the file to get reviews_Books_5.json
  3. Run the script: python preprocess/data.py book
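For anyone who wants to cross-check the numbers independently of data.py, here is a minimal sketch that counts distinct users, distinct items, and total interactions straight from the raw review file. It assumes each line of reviews_Books_5.json is a JSON object with `reviewerID` and `asin` fields; the function name `dataset_stats` and the sample records are hypothetical. It applies no filtering at all, so it is not a reimplementation of the repository's preprocessing, and its counts may differ from what data.py prints.

```python
import json


def dataset_stats(lines):
    """Count distinct items, distinct users, and total interactions.

    `lines` is any iterable of JSON-lines strings (assumed format:
    one review object per line with `reviewerID` and `asin` keys).
    """
    users, items, behaviors = set(), set(), 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        users.add(record["reviewerID"])
        items.add(record["asin"])
        behaviors += 1
    return {
        "total items": len(items),
        "total users": len(users),
        "total behaviors": behaviors,
    }


# Hypothetical sample records, only to show the expected input shape.
sample = [
    '{"reviewerID": "U1", "asin": "B001", "unixReviewTime": 1}',
    '{"reviewerID": "U1", "asin": "B002", "unixReviewTime": 2}',
    '{"reviewerID": "U2", "asin": "B001", "unixReviewTime": 3}',
]
stats = dataset_stats(sample)
print(stats)
```

To run it on the real file, pass an open file handle, e.g. `with open("reviews_Books_5.json") as f: print(dataset_stats(f))`. Comparing these raw counts with the script's output should show whether the gap comes from the input file or from a filtering step.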

The mismatch confuses me. Could you explain where it comes from, or publish the latest version of data.py?

Thank you for your feedback!

Tokkiu, Oct 12 '22 08:10