goodreads
How were the users chosen?
I see that there are 876,145 total users in the dataset, but Goodreads has 90 million users (as of July 2019). How were those 876,145 users selected? Was there a minimum number of ratings?
Hi Santosh, the users in this dataset are those who were in the top 1000 book clubs (https://www.goodreads.com/group) back in early 2017 and chose to make their bookshelves public, so they are just a subset of the Goodreads community.
Are there any plans for a complete Goodreads user review dataset?
I started a script here, but it needs some work:
https://colab.research.google.com/drive/1uOyVlKaT4QFtce9yQpKj9hRtj5z8Uyta
It downloads reviews directly from RSS feeds, so it runs pretty fast. It still needs work on confirming that it has retrieved all of a user's books (I think there might be timeouts) and on handling books that have several versions/editions.
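For anyone picking this up, here is a rough sketch of how the two open issues might be approached. It is not the code from the notebook. It assumes the feed is paginated and that an empty page marks the end, that each `<item>` carries `title` and `author_name` child elements, and that a timeout surfaces as `TimeoutError`. The helper names (`parse_items`, `fetch_all_pages`, `merge_editions`) and the `fetch_page` callable are hypothetical, and matching editions on title plus author is only a crude heuristic.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List


def parse_items(rss_text: str) -> List[Dict[str, str]]:
    """Turn every <item> in an RSS document into a dict of child tag -> text."""
    root = ET.fromstring(rss_text)
    return [
        {child.tag: (child.text or "").strip() for child in item}
        for item in root.iter("item")
    ]


def fetch_all_pages(
    fetch_page: Callable[[int], str],  # hypothetical: returns the RSS text for a page
    max_retries: int = 3,
    max_pages: int = 1000,
) -> List[Dict[str, str]]:
    """Walk pages 1, 2, ... until an empty page, retrying a page on timeout."""
    all_items: List[Dict[str, str]] = []
    for page in range(1, max_pages + 1):
        for attempt in range(max_retries):
            try:
                text = fetch_page(page)
                break
            except TimeoutError:
                # Give up loudly rather than silently dropping a page.
                if attempt == max_retries - 1:
                    raise
        items = parse_items(text)
        if not items:  # assumption: an empty page means the shelf is exhausted
            break
        all_items.extend(items)
    return all_items


def merge_editions(items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first item per (title, author) pair, ignoring case."""
    seen: Dict[tuple, Dict[str, str]] = {}
    for item in items:
        key = (
            item.get("title", "").casefold(),
            item.get("author_name", "").casefold(),
        )
        seen.setdefault(key, item)
    return list(seen.values())
```

Passing the fetcher in as a callable keeps the retry and stop logic testable without network access. A real `fetch_page` would have to build the feed URL and translate its HTTP library's timeout error into `TimeoutError`. Editions whose titles differ (for example, with a series suffix) would not be merged by this key.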