xA-Scraper
Fixed Patreon page iteration
This fixes the page iteration on Patreon. The previous mechanism was (for whatever reason) skipping posts seemingly at random. This one:
- Stores the request string in a variable. Note that the page[cursor] parameter has been obviated for the first request.
- After making the request, pulls the next request string from the data returned by the API, which advances the iteration to the next page.
I don't normally work in Python, so I didn't want to make more changes than were necessary to fix the bug. It may be a good idea to use the presence of the 'next' link to decide whether to continue looping, instead of using had_post. Also, the code that strips 'www.patreon.com/api' is a bit fragile right now; a regex might be a better way to do this.
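For reference, here is a rough sketch of what the loop could look like if it were driven by the 'next' link rather than had_post. This is not the code in this PR. The names `fetch`, `iter_pages`, and `strip_api_prefix` are hypothetical, the response is assumed to carry the link under `links.next`, and the `/api` path prefix is assumed from the URL being stripped today. It also uses `urllib.parse` instead of a regex to remove the host and prefix, which is a substitution for the regex idea above.

```python
from urllib.parse import urlsplit

API_PREFIX = "/api"  # assumption: the API lives under this path prefix


def strip_api_prefix(next_url):
    """Turn an absolute 'next' URL into a request string relative to the API root."""
    parts = urlsplit(next_url)
    path = parts.path
    # Only strip a whole "/api/" path segment, not e.g. "/apiary".
    if path.startswith(API_PREFIX + "/"):
        path = path[len(API_PREFIX):]
    return path + ("?" + parts.query if parts.query else "")


def iter_pages(fetch, first_request):
    """Yield each page, following the 'next' link until the API stops providing one.

    `fetch` is a hypothetical callable that takes a request string and
    returns the parsed JSON response as a dict.
    """
    request = first_request
    while request:
        page = fetch(request)
        yield page
        # Assumed response shape: {"data": [...], "links": {"next": "<absolute url>"}}
        next_url = (page.get("links") or {}).get("next")
        request = strip_api_prefix(next_url) if next_url else None
```

With this shape the loop ends as soon as a response has no 'next' link, so no separate had_post flag is needed, and the first request can still omit page[cursor].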
> Stores the request string in a variable. Note that the page[cursor] parameter has been obviated for the first request.
I can't remember the reason for things being structured the way they are, but I'd bet the "next" parameter is something that post-dates the initial implementation. I'd like to think that if I had seen something like that, I'd have used it, rather than the cursor-based mess that's currently there.
Anyways, awesome! I'm glad this is working for someone else! Let me know if you want to make the change discussed above, or I can pull it and do it myself.
> I'd bet the "next" parameter is something that post-dates the initial implementation.
I figured that was the case.