facebook-scraper

Is it possible to get last_post_id from get_profile()?

Open Vanguard-52236 opened this issue 2 years ago • 4 comments

Since we already make a request to get the profile, is there a way to capture the last post_id from the page that is already loaded, instead of making a second request with get_posts() just to check whether there are any new posts?

Vanguard-52236 avatar May 05 '22 18:05 Vanguard-52236

Sure, https://github.com/kevinzg/facebook-scraper/commit/bd7690306d219d15ddc0fdb397ae488de031a923 should do it

neon-ninja avatar May 11 '22 00:05 neon-ninja

This works great!

I'm wondering: since you need to make an extra request anyway to get the whole post, might we as well include the whole thing as top_post? Your thoughts?

        if kwargs.get("allow_extra_requests", True):
            logger.debug(f"Requesting page from: {account}")
            response = self.get(account)
            top_post = response.html.find(
                '[data-ft*="top_level_post_id"]:not([data-sigil="m-see-translate-link"])',
                first=True,
            )
            result["top_post"] = PostExtractor(
                top_post, kwargs, self.get
            ).extract_post()

Vanguard-52236 avatar May 21 '22 15:05 Vanguard-52236
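
For readers following along, here is a minimal sketch of how a `top_post` entry like the one proposed above might be used to detect new posts without calling get_posts(). It assumes the profile dict carries a `top_post` dict with a `post_id` key; the helper names `newest_post_id` and `has_new_post` are hypothetical and not part of facebook-scraper, and the profile below is a made-up stand-in rather than real scraper output.

```python
from typing import Any, Mapping, Optional


def newest_post_id(profile: Mapping[str, Any]) -> Optional[str]:
    """Return the post_id of profile["top_post"], or None if it is absent."""
    top_post = profile.get("top_post") or {}
    return top_post.get("post_id")


def has_new_post(profile: Mapping[str, Any], last_seen_id: Optional[str]) -> bool:
    """True when the profile's top post differs from the one seen last time."""
    current = newest_post_id(profile)
    return current is not None and current != last_seen_id


# Stand-in for the dict get_profile() would return; the values are made up.
profile = {"Name": "Example", "top_post": {"post_id": "123", "text": "hello"}}

print(has_new_post(profile, last_seen_id="100"))
print(has_new_post(profile, last_seen_id="123"))
```

Note that the first post on a page is not necessarily the newest one (a pinned post, for example, may sit at the top), so a check like this is only a heuristic.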

Provided a PR, if you're okay with it.

#761

Vanguard-52236 avatar May 21 '22 15:05 Vanguard-52236

Sure, https://github.com/kevinzg/facebook-scraper/commit/e4e1390c4186a0d23270a9d5908f5c7705514203 should do it. Removing source is pretty easy: either top_post.pop("source") or del top_post["source"] will do it.

neon-ninja avatar May 23 '22 02:05 neon-ninja
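
The two ways of dropping `source` mentioned above behave slightly differently when the key is missing. A small sketch with plain dicts follows; the dict contents are made-up stand-ins, and a string is used in place of whatever object the scraper actually stores under `source`.

```python
# Stand-in for a top_post dict; the values here are illustrative only.
top_post = {"post_id": "123", "text": "hello", "source": "<div>...</div>"}

# pop() returns the removed value; passing a default avoids a KeyError
# if the key is not present.
removed = top_post.pop("source", None)

# del raises KeyError when the key is absent, so guard it if that can happen.
other = {"post_id": "456", "source": "<div>...</div>"}
if "source" in other:
    del other["source"]

print(top_post)
print(other)
```

`pop` with a default is the more forgiving choice when it is not certain that every post dict contains `source`.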