facebook-page-post-scraper
Scraper not getting all posts from a group
Received multiple comments from people hitting this. Investigating.
I suspect it's a Graph API issue and not your code. What I've found useful is to scrape posts in hourly windows; this takes a lot longer but seems to work better.
@josesho, how do you scrape in specified hourly windows? Can you write more about that?
This is also happening for posts on pages; I get only about 300 back. Any ideas on how to fix this? @josesho does your approach also work for pages? How do you do it?
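For anyone wondering what hourly windows could look like in practice, here is a minimal sketch. It is not necessarily how @josesho does it. It assumes each request is bounded with the Graph API's `since` and `until` time parameters (Unix timestamps); `hourly_windows` and `window_params` are hypothetical helper names, not functions from this repo.

```python
from datetime import datetime, timedelta, timezone


def hourly_windows(start, end):
    """Yield (since, until) datetime pairs covering [start, end) in one-hour steps."""
    step = timedelta(hours=1)
    cursor = start
    while cursor < end:
        # The last window is clipped so it never runs past `end`.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def window_params(since, until, limit=100):
    """Build query parameters for one window (assumes since/until as Unix timestamps)."""
    return {
        "since": int(since.timestamp()),
        "until": int(until.timestamp()),
        "limit": limit,
    }


start = datetime(2017, 5, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2017, 5, 1, 3, 30, tzinfo=timezone.utc)
windows = list(hourly_windows(start, end))
params = [window_params(since, until) for since, until in windows]
```

Each entry in `params` would then be sent as the query string of a separate request. A post that falls exactly on a window boundary may come back twice, so it is probably worth de-duplicating on the post `id` afterwards.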
Looks like the API endpoint stops returning posts after 500 entries (on Pages at least), even on older API versions. (Specifically, there is no `next` URL, and using the `paging_token` to simulate the `next` URL will return an empty set.)
[Screenshot: screen shot 2017-05-01 at 8 03 20 am]
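To make the stopping condition concrete, here is a rough sketch of a paging loop that follows `paging.next` until it is missing or a page comes back empty. It assumes the usual Graph API response shape, with a `data` list and an optional `paging.next` URL. `collect_pages` and the injected `fetch` callable are hypothetical names, not this repo's code, and the fake pages below are made up for illustration.

```python
def collect_pages(fetch, first_url, max_pages=1000):
    """Follow `paging.next` links, returning all posts seen.

    `fetch` is any callable that takes a URL and returns the decoded JSON
    response as a dict (for example, a thin wrapper around an HTTP client).
    """
    posts = []
    url = first_url
    pages = 0
    while url and pages < max_pages:
        payload = fetch(url)
        data = payload.get("data", [])
        if not data:
            # An empty set is what the endpoint returned past the cutoff.
            break
        posts.extend(data)
        # A missing `next` URL ends the loop on the following check.
        url = payload.get("paging", {}).get("next")
        pages += 1
    return posts


# Fake responses standing in for the API: page 2 has no `next` URL.
fake_pages = {
    "page1": {"data": [{"id": "1"}, {"id": "2"}], "paging": {"next": "page2"}},
    "page2": {"data": [{"id": "3"}]},
}
result = collect_pages(fake_pages.__getitem__, "page1")
```

With the behavior described above, a loop like this would simply end early at the cutoff without raising an error, which is why the missing posts are easy to overlook.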
This won't break the scrapers which want recent information, but it will break the ones which want to gather all data from a group. That is unfortunate, but there is likely no workaround.
I will update the README where appropriate.
That is very unfortunate, especially because it's never mentioned in the documentation, but oh well.
Apparently this was a bug in the API; it should work again now.
Yes, no longer seeing the behavior above, but the alternate behavior could be a portent of things to come.