Scraper not getting all posts from a group

Open minimaxir opened this issue 7 years ago • 7 comments

Received multiple comments from people hitting this. Investigating.

minimaxir avatar Apr 06 '17 12:04 minimaxir

I suspect it's a Graph API issue and not your code. What I've found useful is to scrape posts in hourly windows; this takes a lot longer but seems to return more of the posts.
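A minimal sketch of what splitting a date range into hourly windows could look like. This is not the scraper's actual code: `hourly_windows` is a hypothetical helper, and it assumes each pair of Unix timestamps would be passed as the `since` and `until` parameters of one feed request.

```python
from datetime import datetime, timedelta, timezone


def hourly_windows(start, end):
    """Yield (since, until) Unix-timestamp pairs covering [start, end) in one-hour steps.

    Hypothetical helper: each pair is meant to be sent as the `since` and
    `until` query parameters of a single feed request (an assumption).
    """
    step = timedelta(hours=1)
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past `end`.
        window_end = min(cursor + step, end)
        yield int(cursor.timestamp()), int(window_end.timestamp())
        cursor = window_end


# Example: three one-hour windows on 2017-04-01 (UTC).
start = datetime(2017, 4, 1, 0, tzinfo=timezone.utc)
end = datetime(2017, 4, 1, 3, tzinfo=timezone.utc)
windows = list(hourly_windows(start, end))
```

Each window needs at least one request of its own, which is why this approach is much slower than a single paged query over the whole range.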

josesho avatar Apr 07 '17 01:04 josesho

@josesho, how do you scrape in hourly windows? Can you write more about that?

PhanDuc avatar Apr 07 '17 04:04 PhanDuc

This is also happening for posts on pages; I only get about 300 back. Any ideas on how to fix this? @josesho, does your approach also work for pages? How do you do it?

lukaskawerau avatar May 01 '17 14:05 lukaskawerau

Looks like the API endpoint stops returning posts after 500 entries (on Pages at least), even on older API versions. Specifically, the response has no `next` URL, and using the `paging_token` to construct the next URL manually returns an empty set.
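For context, a rough sketch of the paging loop this affects, assuming responses have the usual `data` list and an optional `paging.next` URL. `iter_posts` and `fetch_json` are hypothetical names, not the scraper's own functions; `fetch_json` stands in for whatever performs the HTTP request and decodes the JSON.

```python
def iter_posts(first_url, fetch_json, max_pages=1000):
    """Yield posts page by page until the API stops offering a next page.

    `fetch_json` is a hypothetical callable that takes a URL and returns the
    decoded JSON response as a dict. `max_pages` is a safety cap, not an API limit.
    """
    url = first_url
    pages = 0
    while url and pages < max_pages:
        payload = fetch_json(url)
        data = payload.get("data", [])
        if not data:
            # An empty page ends the crawl (the behavior described above).
            break
        yield from data
        # A missing "next" link also ends the crawl.
        url = payload.get("paging", {}).get("next")
        pages += 1
```

With a loop like this, a missing `next` link or an empty page is indistinguishable from reaching the true end of the feed, so the scraper stops silently at the cutoff.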

[Screenshot, 2017-05-01 8:03 AM]

This won't break the scrapers which want recent information, but it will break the ones which want to gather all data from a group. That is unfortunate, but there is likely no workaround.

I will update the README where appropriate.

minimaxir avatar May 01 '17 15:05 minimaxir

That is very unfortunate, especially because it's never mentioned in the documentation, but oh well.

lukaskawerau avatar May 01 '17 15:05 lukaskawerau

Apparently this was a bug in the API; it should work again now.

lukaskawerau avatar May 03 '17 07:05 lukaskawerau

Yes, no longer seeing the behavior above, but the alternate behavior could be a portent of things to come.

minimaxir avatar May 03 '17 16:05 minimaxir