facebook-scraper
feature suggestion re: number of requests made
Would it be possible/easy for facebook_scraper to internally keep track of the number of requests it makes to Facebook, in such a way that users could query the current count?
In my own application code, I try to keep track of the number of queries myself. The main problem is that I can't be sure I properly understand what's going on under the hood, and therefore whether I'm calculating everything correctly.
Sure, https://github.com/kevinzg/facebook-scraper/commit/ba26b7bf3a1a61dfd60246b64a37f5c5a2a492ae should solve this. Usage:
```python
from facebook_scraper import _scraper
from facebook_scraper import *

set_cookies("cookies.txt")
posts = get_posts("dudukovich", pages=10, options={'allow_extra_requests': False})
for post in posts:
    print(post["post_id"], post["likes"], post["reaction_count"], post["comments"])
print(f"Made {_scraper.request_count} requests")
```
This is fantastic. Thank you!
Using it, I've discovered that I have not been calculating requests correctly. I thought that
```python
from facebook_scraper import _scraper
from facebook_scraper import *

set_cookies("cookies.txt")
posts = get_posts("dudukovich", pages=1, options={'allow_extra_requests': False})
for post in posts:
    print(post["post_id"], post["likes"], post["reaction_count"], post["comments"])
print(f"Made {_scraper.request_count} requests")
```
(note that pages=1) would only make two requests: one for the login and one for fetching the posts from one page of a profile. But the actual output is "Made 3 requests".
What are the three requests for?
The scraper tried to get /posts, failed, and fell back to /
Oh, in other words it first tries /posts/###, fails, and then tries /###? Is there a way I can avoid the first failure (and hence extra request)?
I now know that that code always takes at least two requests for any username.
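To make the accounting concrete, here is a hypothetical stdlib-only sketch of the fallback behaviour described above (the Session class and URLs are illustrative, not the library's code): one login request, one failed /posts request, and one successful fallback request add up to the three observed:

```python
# Hypothetical sketch of the /posts fallback; not facebook_scraper's code.

class Session:
    def __init__(self, posts_page_exists):
        self.request_count = 0
        self.posts_page_exists = posts_page_exists

    def get(self, url):
        self.request_count += 1
        if url.endswith("/posts") and not self.posts_page_exists:
            return None  # simulate the failed /posts request
        return url

def fetch_page(session, username):
    """Try /<username>/posts first; on failure fall back to /<username>."""
    resp = session.get(f"https://m.facebook.com/{username}/posts")
    if resp is None:  # /posts layout unavailable for this profile
        resp = session.get(f"https://m.facebook.com/{username}")
    return resp

session = Session(posts_page_exists=False)
session.get("https://m.facebook.com/login")  # request 1: login
fetch_page(session, "dudukovich")            # request 2: /posts fails; request 3: / succeeds
print(session.request_count)  # → 3
```

With the fallback removed, the same single-page fetch would cost only two requests.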
Sure, https://github.com/kevinzg/facebook-scraper/commit/089688dbe6405683aa5f9c0c87cc0de04de47d55 will stop the scraper from trying /posts
Wow, that cuts down my requests by almost half. Thanks! (Do you happen to have any idea whether this kind of reduction in requests matters to Facebook with respect to soft bans?)
Many thanks for the _scraper.request_count addition. This makes it so much easier to understand exactly what triggers new requests.
My own request calculations are much closer now but still off sometimes, and I'm trying to track down under what conditions my calculations diverge from reality. Are there other failed-request fallbacks when fetching profiles, posts, or responses?
Probably not. No problem. You can use enable_logging() to see what requests are being made and when.
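Presumably enable_logging() just attaches a handler to the library's logger, which is the standard stdlib pattern sketched below. The logger name "facebook_scraper" and the sample message are assumptions for illustration, not the library's exact output:

```python
import logging

# Hypothetical sketch of what an enable_logging() helper typically does:
# attach a stream handler to the library's logger and lower its level.
logger = logging.getLogger("facebook_scraper")  # assumed logger name

def enable_logging(level=logging.DEBUG):
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)

enable_logging()
# A scraper would then emit a line like this for each request it makes
# (the message text here is illustrative):
logger.debug("Requesting page from: https://m.facebook.com/dudukovich")
```

Reading the debug log alongside _scraper.request_count makes it easy to attribute each increment to a specific URL.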
Thanks, I should have realized that but didn't! 😊