facebook_page_scraper
How do you close the HTTP client so I can scrape multiple sites?
I'm trying to scrape multiple Facebook pages in a loop, but even though the page name changes on each iteration, it keeps pulling posts from the first page.
Can you share the code you're running so I can understand the problem better?
Absolutely, I appreciate any help you can give. I've been running this manually, restarting the kernel for each new scrape.
import facebook_page_scraper
import pandas as pd

PATH = mypath
sample_lst = ['BigBendNPS', 'YosemiteNPS', 'RockyNPS']

def scrap_facebook(page_nm):
    page_name = page_nm
    posts_count = 15
    browser = "chrome"
    timeout = 600  # 600 seconds
    headless = True
    park_scrap = facebook_page_scraper.Facebook_scraper(page_name,
                                                        posts_count,
                                                        browser,
                                                        timeout=timeout,
                                                        headless=headless)
    csv_data = park_scrap.scrap_to_csv(PATH+page_nm)

for p in sample_lst:
    print(f'Scraping Facebook Page Name: {p}')
    scrap_facebook(p)
Then, when I read the CSV files back in and compare them:
big_bend_df = pd.read_csv(PATH+'\\Facebook Comment ScraperBigBendNPS.csv')
yosemite_df = pd.read_csv(PATH+'\\Facebook Comment ScraperYosemiteNPS.csv')
rockymnt_df = pd.read_csv(PATH+'\\Facebook Comment ScraperRockyNPS.csv')
print(f'big_bend_df exact match to yosemite_df: {big_bend_df.equals(yosemite_df)}')
print(f'big_bend_df exact match to rockymnt_df: {big_bend_df.equals(rockymnt_df)}')
OUTPUT
big_bend_df exact match to yosemite_df: True
big_bend_df exact match to rockymnt_df: True
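For anyone wanting to confirm the same symptom without loading every file into pandas, here is a minimal sketch that groups output files by content hash. The helper name `group_identical_files` is hypothetical and not part of facebook_page_scraper. It checks byte-for-byte equality, which is stricter than `DataFrame.equals`, so files that differ only in formatting would not be grouped.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical_files(paths):
    """Return lists of paths whose file contents are byte-identical.

    Hypothetical helper for illustration; only groups with more than
    one member are returned, in the order the files were first seen.
    """
    groups = defaultdict(list)
    for path in paths:
        # Hash the raw bytes so identical scrapes collapse to one key.
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return [members for members in groups.values() if len(members) > 1]
```

If the three CSV paths from the snippet above all come back in a single group, every scrape wrote the same content, which matches the `True` results shown.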
Okay, this issue has come up before, in #45. Thanks for reporting. I don't have a clear idea of the cause yet, but I'll treat this as a bug for now.
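Until the cause is found, one possible workaround is to automate the kernel restart by running each scrape in its own Python process. This is only a sketch: it assumes the stale data lives in process-level state (which the restart observation suggests but does not prove), and `SCRAPE_SNIPPET` and `run_in_fresh_interpreter` are hypothetical names. The `Facebook_scraper` call simply mirrors the one in the question.

```python
import subprocess
import sys

# Child script: repeats the Facebook_scraper call from the question.
# It reads the page name and output file name from the command line.
SCRAPE_SNIPPET = """
import sys
import facebook_page_scraper

page_name, out_file = sys.argv[1], sys.argv[2]
scraper = facebook_page_scraper.Facebook_scraper(
    page_name, 15, "chrome", timeout=600, headless=True)
scraper.scrap_to_csv(out_file)
"""


def run_in_fresh_interpreter(snippet, *args, timeout=None):
    """Run `snippet` in a new Python process so no state carries over.

    Extra positional arguments are passed to the child as sys.argv[1:].
    Returns the subprocess.CompletedProcess for inspection.
    """
    return subprocess.run(
        [sys.executable, "-c", snippet, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

In the loop from the question, `scrap_facebook(p)` would become `run_in_fresh_interpreter(SCRAPE_SNIPPET, p, PATH + p)`. Checking `returncode` and `stderr` on the result shows whether a given page failed.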