facebook_page_scraper How do you close the HTTP client so I can scrape multiple sites?

Open davidkurtenb opened this issue 1 year ago • 3 comments

I'm trying to scrape multiple Facebook pages in a loop, but even though the page name changes on each iteration, it keeps pulling posts from the first page.

May 30 '23 03:05 davidkurtenb

Can you share the code you're running so I can understand the problem better?

Jun 03 '23 07:06 shaikhsajid1111

Absolutely, I appreciate any help you can give. I've been running this manually and having to restart the kernel for each new scrape.

import facebook_page_scraper
import pandas as pd

PATH = mypath

sample_lst = ['BigBendNPS','YosemiteNPS', 'RockyNPS']

def scrap_facebook(page_nm):
    page_name = page_nm
    posts_count = 15
    browser = "chrome"
    timeout = 600  # seconds
    headless = True
    park_scrap = facebook_page_scraper.Facebook_scraper(page_name, 
                                                        posts_count, 
                                                        browser, 
                                                        timeout=timeout, 
                                                        headless=headless)
    csv_data = park_scrap.scrap_to_csv(PATH+page_nm)

for p in sample_lst:
    print(f'Scraping Facebook Page Name: {p}')
    scrap_facebook(p)

Then, when I read the CSV files back in:

big_bend_df = pd.read_csv(PATH+'\\Facebook Comment ScraperBigBendNPS.csv')
yosemite_df = pd.read_csv(PATH+'\\Facebook Comment ScraperYosemiteNPS.csv')
rockymnt_df = pd.read_csv(PATH+'\\Facebook Comment ScraperRockyNPS.csv')

print(f'big_bend_df exact match to yosemite_df: {big_bend_df.equals(yosemite_df)}')
print(f'big_bend_df exact match to rockymnt_df: {big_bend_df.equals(rockymnt_df)}')

OUTPUT

big_bend_df exact match to yosemite_df: True
big_bend_df exact match to rockymnt_df: True

Jun 03 '23 10:06 davidkurtenb

Okay, this issue has come up before; see #45. Thanks for reporting. I don't have much of a clue about the cause yet, but I'll treat this as a bug for now.

Jun 04 '23 09:06 shaikhsajid1111
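
Since the cause is still unknown, one possible workaround follows from the note above that restarting the kernel between scrapes was needed: run each scrape in its own Python interpreter so that no state carries over from one page to the next. This is a minimal sketch, not a confirmed fix. It assumes the stale state lives inside the Python process. `WORKER`, `run_in_fresh_process`, and `scrape_pages` are hypothetical names introduced here, and the scraper calls are copied from the loop posted above rather than checked against the library.

```python
import subprocess
import sys

# Script run inside each child interpreter. The scraper calls mirror the
# loop posted above; the single-argument scrap_to_csv call is copied from
# that post, not checked against the library's documentation.
WORKER = """
import sys
import facebook_page_scraper

page_name, out_prefix = sys.argv[1], sys.argv[2]
scraper = facebook_page_scraper.Facebook_scraper(
    page_name, 15, "chrome", timeout=600, headless=True
)
scraper.scrap_to_csv(out_prefix + page_name)
"""


def run_in_fresh_process(script, *args):
    """Run `script` in a new Python interpreter and return its exit code."""
    completed = subprocess.run([sys.executable, "-c", script, *args])
    return completed.returncode


def scrape_pages(page_names, out_prefix):
    """Scrape each page in its own process; return {page_name: exit_code}."""
    exit_codes = {}
    for name in page_names:
        print(f"Scraping Facebook Page Name: {name}")
        exit_codes[name] = run_in_fresh_process(WORKER, name, out_prefix)
    return exit_codes
```

Calling `scrape_pages(['BigBendNPS', 'YosemiteNPS', 'RockyNPS'], PATH)` would stand in for the original loop. A non-zero exit code for a page means its child process failed, so that page's CSV should not be trusted.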
