GoodreadsScraper

Scrape data from Goodreads using Scrapy and Selenium :books:

Results 7 GoodreadsScraper issues

The CSS extractors in [`book_spider`](https://github.com/havanagrawal/GoodreadsScraper/blob/master/GoodreadsScraper/spiders/book_spider.py#L21-L42) etc. can get out of sync, and the only way to detect this is with a trial run. Solution: Add a unit test that retrieves...

If the `dateutil.parser.parse` function cannot find a component of the timestamp (any of day, month, or year), it replaces it with the *current* date's components. This can cause problems in...
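One way to see which components were silently filled in is to parse the same string twice with two different `default` values and compare the results. The sketch below assumes `python-dateutil` is installed; the helper name `parse_with_missing` and the sentinel dates are hypothetical and not part of the scraper.

```python
from datetime import datetime

from dateutil import parser

# Two sentinel defaults that differ in year, month, and day.
# Any component that differs between the two parses came from the
# default rather than from the input string.
_SENTINEL_A = datetime(1900, 1, 1)
_SENTINEL_B = datetime(2001, 2, 2)


def parse_with_missing(text):
    """Return (parsed, missing), where missing lists absent date components.

    Hypothetical helper: `parsed` uses _SENTINEL_A for any absent component.
    """
    a = parser.parse(text, default=_SENTINEL_A)
    b = parser.parse(text, default=_SENTINEL_B)
    missing = [
        name
        for name in ("year", "month", "day")
        if getattr(a, name) != getattr(b, name)
    ]
    return a, missing


if __name__ == "__main__":
    for sample in ("March 5, 2019", "March 2019"):
        parsed, missing = parse_with_missing(sample)
        print(sample, "->", parsed.date(), "missing:", missing)
```

A caller could then decide whether to keep a partial date, store only the known components, or drop the value, instead of inheriting today's day or month.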

I thought it would be nice to embed the cover image in the .jl files so people can use it for different purposes.

This PR adds a method to `crawl.py` to scrape the books of a single author. Example: `python -m crawl single-author --author_id 19520462.Arlan_Hamilton` It is based on the scraper for a...

Hey, have you tried adding a crawler for scraping the genre pages of Goodreads, like https://www.goodreads.com/shelf/show/war?page=1? I tried it, but it always goes for page 1 only. Even if I...
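As a rough illustration of the URL pattern in that question, the sketch below builds the shelf page URLs explicitly instead of relying on a "next page" link. The function name `shelf_page_urls` and its parameters are hypothetical, and whether Goodreads actually serves pages beyond the first to a given session is not verified here.

```python
from urllib.parse import urlencode

# Base path taken from the URL quoted in the issue.
SHELF_BASE = "https://www.goodreads.com/shelf/show/"


def shelf_page_urls(shelf, first_page=1, last_page=3):
    """Return the shelf URLs for pages first_page..last_page (inclusive)."""
    return [
        f"{SHELF_BASE}{shelf}?{urlencode({'page': page})}"
        for page in range(first_page, last_page + 1)
    ]


if __name__ == "__main__":
    for url in shelf_page_urls("war", 1, 3):
        print(url)
```

A spider could use a list like this as its start URLs, which makes it easy to check whether each page returns different content.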

I just crawled my to-read list and got all the info for the books in only ~20% of cases. The rest is just links. Such as [this one](https://www.goodreads.com/book/show/40376072-children-of-ruin),...

Hey, I'm having issues similar to Issue #17 where the .jl book file seems to be sporadically displaying some results with only the URL while others are complete with all...
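To measure how often this happens, a small script can split a JSON Lines file into records that carry data and records that contain only a link. This is a minimal sketch using only the standard library; the key name `url` and the helper name `split_records` are assumptions, since the actual field names in the scraper's output are not shown here.

```python
import json


def split_records(lines, url_key="url"):
    """Split JSON Lines records into (with_data, url_only).

    A record counts as URL-only when every field other than `url_key`
    is missing or empty. `url_key` is an assumed field name.
    """
    with_data, url_only = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Fields that actually carry a value, ignoring the URL itself.
        filled = [
            key
            for key, value in record.items()
            if key != url_key and value not in (None, "", [])
        ]
        (with_data if filled else url_only).append(record)
    return with_data, url_only


if __name__ == "__main__":
    sample = [
        '{"url": "https://www.goodreads.com/book/show/1", "title": "Example"}',
        '{"url": "https://www.goodreads.com/book/show/2"}',
    ]
    with_data, url_only = split_records(sample)
    print(len(with_data), "with data,", len(url_only), "URL-only")
```

Passing an open file object instead of the sample list works the same way, and the URL-only records give a ready-made list of pages to retry.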