facebook-scraper
`get_posts_by_search` has no return
Hello, guys! I'm having trouble using get_posts_by_search to get posts. Here's my code:
import facebook_scraper as fb
fb.set_cookies("cookies.txt")
keywords = "nintendo"
for post in fb.get_posts_by_search(keywords, pages=10, options={"comments": True, "reactors": True, "allow_extra_requests": True}):
    print(post['text'])
It raises no errors but also returns no posts. It works as expected only occasionally, which really confuses me. Do you know how to fix it? I'd really appreciate a reply!
I am having the same issue. Running get_posts_by_search doesn't seem to find any posts. My code is similar to @KKHYA's.
from facebook_scraper import get_posts, get_posts_by_search
for post in get_posts_by_search('cameroon', cookies='cookies2.txt', extra_info=False, pages=10, options={'comments': True, 'posts_per_page':10}):
    print('test')
I hope this isn't a major problem to fix. I find this repo very helpful! Thanks!
Can someone help with this issue? Really don't know why I can't get any posts by this function.
I uninstalled and reinstalled the latest master branch:
pip install git+https://github.com/kevinzg/facebook-scraper.git
Then I tried to scrape using a keyword that I know must have a lot of posts, and I also updated my cookies file.
for post in get_posts_by_search('biden', cookies='cookies3.txt', extra_info=False, pages=5):
    print(post)
Unfortunately, I still do not get any output from this.
I ran this in Google Colab just in case this was an issue in my environment. I didn't solve this, but I do get an additional warning that is informative at least.
WARNING:facebook_scraper.page_iterators:No raw posts (<article> elements) were found in this page.
It seems that the get_posts_by_search is no longer returning anything. Is anyone else having this issue? I wish I could be more helpful, but I'm still learning about all this. Thank you!
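For anyone who wants to see that warning outside Colab: it is emitted by a logger named facebook_scraper.page_iterators, so standard logging configuration should surface it. A minimal sketch, assuming the library logs through Python's standard logging module (the WARNING:name:message prefix suggests it does); show_scraper_logs is a made-up helper name, not part of the library:

```python
import logging


def show_scraper_logs(level: int = logging.DEBUG) -> logging.Logger:
    """Route facebook_scraper log records to stderr (hypothetical helper)."""
    logger = logging.getLogger("facebook_scraper")
    logger.setLevel(level)
    # Attach a stream handler only once so repeated calls do not duplicate output.
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
        logger.addHandler(handler)
    return logger


# Call this before running get_posts_by_search.
show_scraper_logs()
```

Child loggers such as facebook_scraper.page_iterators inherit the level set here, so the "No raw posts" warning should show up if the page container is not found.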
I encountered the same issue. I added the following to my code and it solved the problem. It's not a long-term solution, but I hope it helps.
page_iterators.py
class SearchPageParser(PageParser):
    cursor_regex = re.compile(r'href[:=]"[^"]+(/search/[^"]+)"')
    cursor_regex_2 = re.compile(r'href":"[^"]+(/search/[^"]+)"')

    def get_page(self) -> Page:  # added
        return super()._get_page('article', 'article')  # added

    def get_next_page(self) -> Optional[URL]:
        if self.cursor_blob is not None:
            match = self.cursor_regex.search(self.cursor_blob)
            if match:
                return match.groups()[0]
            match = self.cursor_regex_2.search(self.cursor_blob)
            if match:
                value = match.groups()[0]
                return value.encode('utf-8').decode('unicode_escape').replace('\\/', '/')
I ran into the same situation. I modified the source code as shown below, and that solved it. I suspect Facebook's page markup has changed. Hope it helps.
class SearchPageParser(PageParser):
    cursor_regex = re.compile(r'href[:=]"[^"]+(/search/[^"]+)"')
    cursor_regex_2 = re.compile(r'href":"[^"]+(/search/[^"]+)"')

    def get_page(self) -> Page:  # added
        return super()._get_page('div[data-module-role="TOP_PUBLIC_POSTS"]', 'article')  # added

    def get_next_page(self) -> Optional[URL]:
        if self.cursor_blob is not None:
            match = self.cursor_regex.search(self.cursor_blob)
            if match:
                return match.groups()[0]
            match = self.cursor_regex_2.search(self.cursor_blob)
            if match:
                value = match.groups()[0]
                return value.encode('utf-8').decode('unicode_escape').replace('\\/', '/')
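In case it helps with debugging pagination, the two cursor patterns in the snippets above can be tried on their own. This is only a sketch: the regexes and the decode step are copied from the code above, but extract_next_url is a hypothetical standalone function and the sample blobs are invented, not real Facebook responses.

```python
import re
from typing import Optional

# The two patterns from SearchPageParser, copied unchanged.
cursor_regex = re.compile(r'href[:=]"[^"]+(/search/[^"]+)"')
cursor_regex_2 = re.compile(r'href":"[^"]+(/search/[^"]+)"')


def extract_next_url(cursor_blob: Optional[str]) -> Optional[str]:
    """Standalone copy of the get_next_page logic, for experimenting."""
    if cursor_blob is None:
        return None
    # First try a plain HTML attribute such as href="...".
    match = cursor_regex.search(cursor_blob)
    if match:
        return match.groups()[0]
    # Then try a JSON-style "href":"..." pair, which may contain \uXXXX escapes.
    match = cursor_regex_2.search(cursor_blob)
    if match:
        value = match.groups()[0]
        return value.encode('utf-8').decode('unicode_escape').replace('\\/', '/')
    return None


# Invented sample inputs (assumptions, for illustration only).
html_blob = '<a href="https://m.facebook.com/search/posts/?q=nintendo">See more</a>'
json_blob = r'{"href":"https://m.facebook.com/search/posts/?q=nintendo\u0026cursor=abc"}'

print(extract_next_url(html_blob))
print(extract_next_url(json_blob))
print(extract_next_url('no link in this blob'))
```

If the function returns None for the blob your session receives, there is no next page to follow, which may be one reason a search stops early.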
@yangsu10yen It works! Sorry for the late reply. But here's another problem: get_posts_by_search ends after getting 9 posts. Do you know why this happens?
Did you update the code and reinstall the package? I still get 0 results even after changing the source code.
@belhajManel Yes, I forked the source code and updated it. Then I installed my updated package and it returned 9 posts. But it still returns only 9 posts.
@KKHYA Sorry for the delay in replying.
I ran into the same situation. However, I did not get 9 posts; I got no posts at all.
After investigating the cause, it seems that the tag used to find the page container (in PageParser.get_page()) has changed from <article> to <div>.
After applying the following change, I was able to get the posts successfully. Hope this helps.
index cbe0b59..2cead68 100644
--- a/facebook_scraper/page_iterators.py
+++ b/facebook_scraper/page_iterators.py
@@ -143,7 +143,7 @@ class PageParser:
     def get_page(self) -> Page:
         # Select only elements that have the data-ft attribute
-        return self._get_page('article[data-ft*="top_level_post_id"]', 'article')
+        return self._get_page('article[data-ft*="top_level_post_id"], div[data-ft*="top_level_post_id"]', 'article')
     def get_raw_page(self) -> RawPage:
         return self.html
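To illustrate why widening the selector helps, here is a rough, stdlib-only emulation of what the old and new selectors would match. It is not how facebook-scraper does the matching (the library passes a CSS selector to its own HTML parser); the sample markup and the ContainerCounter and count_containers names are invented for this sketch.

```python
from html.parser import HTMLParser
from typing import Iterable


class ContainerCounter(HTMLParser):
    """Count tags whose data-ft attribute contains "top_level_post_id"."""

    def __init__(self, tags: Iterable[str]) -> None:
        super().__init__()
        self.tags = set(tags)
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Emulates tag[data-ft*="top_level_post_id"] for the allowed tags.
        data_ft = dict(attrs).get("data-ft") or ""
        if tag in self.tags and "top_level_post_id" in data_ft:
            self.count += 1


def count_containers(html: str, tags: Iterable[str]) -> int:
    parser = ContainerCounter(tags)
    parser.feed(html)
    parser.close()
    return parser.count


# Invented markup: one post in the old <article> wrapper, one in the newer <div>.
sample = (
    '<article data-ft=\'{"top_level_post_id":"1"}\'>old layout</article>'
    '<div data-ft=\'{"top_level_post_id":"2"}\'>new layout</div>'
    '<div class="ad">not a post</div>'
)

old_matches = count_containers(sample, ["article"])
new_matches = count_containers(sample, ["article", "div"])
print(old_matches, new_matches)
```

With this sample, the article-only check misses the post wrapped in a div, while the widened check finds both and still skips the div without a data-ft attribute.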