facebook-scraper
Extracting Page Review
Does the library support extracting reviews for pages, or even just a page's overall rating? Nothing shows up when using get_page_info.
It does:
{'about': 'About\n'
'Corner Victoria and Federal Streets, Auckland, New Zealand 1010\n'
'Get Directions\n'
'Rating · 4.5\n'
'(2.6K reviews)\n'
'304,574 people checked in here\n'
'09-363 6000\n'
'[email protected]\n'
'Closed now\n'
'10:00 AM - 6:00 PM\n'
'Closed now\n'
'10:00 AM - 6:00 PM\n'
'10:00 AM - 6:00 PM\n'
'10:00 AM - 6:00 PM\n'
'10:00 AM - 6:00 PM\n'
'10:00 AM - 6:00 PM\n'
'10:00 AM - 6:00 PM\n'
'10:00 AM - 6:00 PM\n'
'10:00 AM - 6:00 PM\n'
'Popular Hours\n'
"One of New Zealand's most exhilarating and spectacular tourist "
"A truly captivating experience awaits visitors to Auckland's Sky "
'Tower. At 328 metres, it is the tallest man-made structure in New '
'Zealand and offers breathtaking views for up to 80 kilometres in '
'every direction.\n'
'Travel up in the glass-fronted lifts to one of the three '
'spectacular viewing platforms, or for more thrills and excitement, '
'SkyWalk round the pergola at 192 metres up or SkyJump off the '
'Relax with a coffee and light refreshments at Sky Lounge or dine at '
"Orbit - Auckland's only 360-degree revolving restaurant.\n"
"Sky Tower is one of New Zealand's most exhilarating and spectacular "
'tourist attractions, you will be amazed at what you can see and do '
'under one roof!\n'
'Price Range · $$\n'
'Landmark & Historical Place\n'
'See more\n'
'See Less',
'checkins': 304574,
'likes': 68922,
'people_talking_about_this': 612}
Note the 'Rating · 4.5\n' and '(2.6K reviews)\n' lines in the about value.
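As a rough illustration of how the rating and review count could be pulled out of an about string like the one above, here is a minimal sketch. The helper name `parse_rating` and the regular expressions are assumptions made for this example and are not part of facebook-scraper. The sketch also assumes English-locale text and that abbreviated counts only use the K or M suffix.

```python
import re
from typing import Optional, Tuple

# Multipliers for abbreviated counts such as "2.6K" (assumed suffixes).
_SUFFIXES = {"": 1, "K": 1_000, "M": 1_000_000}


def parse_rating(about: str) -> Tuple[Optional[float], Optional[int]]:
    """Return (rating, approximate review count) found in an about string.

    Hypothetical helper: either value is None when its line is missing.
    """
    rating_match = re.search(r"Rating\s*·\s*(\d+(?:\.\d+)?)", about)
    count_match = re.search(r"\((\d+(?:[.,]\d+)?)([KM]?)\s+reviews?\)", about)

    rating = float(rating_match.group(1)) if rating_match else None
    count = None
    if count_match:
        # Treat a comma as a thousands separator, e.g. "1,234".
        number = float(count_match.group(1).replace(",", ""))
        count = round(number * _SUFFIXES[count_match.group(2)])
    return rating, count


sample = (
    "About\nGet Directions\nRating · 4.5\n(2.6K reviews)\n"
    "304,574 people checked in here\n"
)
print(parse_rating(sample))  # (4.5, 2600)
```

The count is only approximate because the page itself shows a rounded figure such as 2.6K.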
Oh cool! Thanks for this. I think it doesn't handle some cases, though, such as this one: a Facebook Page with 3 reviews that don't appear in the About.
The resulting About looks like this:
Suggest edits\n
1121 B Labores Street Pandacan, 1011 Manila, Philippines\n
Get Directions\n
84 people checked in here\n
0998 963 3587\n
Send message\n
Open now\n
9 AM - 9:30 PM\n
Open now\n
9 AM - 9:30 PM\n
9 AM - 9:30 PM\n
9 AM - 9:30 PM\n
9 AM - 9:30 PM\n
9 AM - 9:30 PM\n
9 AM - 9:30 PM\n
9 AM - 9:30 PM\n
9 AM - 9:30 PM\n
Fresh, delicious, yummy, refreshing and affordable shake and juices only from Chamba Juice and Shake!!!\n
Price Range · $\n
Smoothie & Juice Bar\n
smoothies, milktea, juies\n
See more\n
See Less
Would a separate feature be needed for extracting the Reviews Tab?
I see - looks like chambajuice doesn't have an about page. This commit (https://github.com/kevinzg/facebook-scraper/commit/a516dfabff4b5937ef99ea25c84e463473a29e3d) should make get_page_info extract the rating, under a new key called rating. No need to raise a separate issue for the feature of extracting reviews, we can re-use this one.
This commit (https://github.com/kevinzg/facebook-scraper/commit/e362c522dd500c3c91ffb858c6044fca3d4b4d9a) should make it possible to extract reviews. Sample usage:
for review in get_page_info("chambajuice")["reviews"]:
    print(review)
{'post_url': 'https://facebook.com/story.php?story_fbid=844190382691206&id=100013007553035&locale2=en_US&__tn__=%2As%2As',
'profile_picture': 'https://scontent.fakl8-1.fna.fbcdn.net/v/t1.6435-1/cp0/e15/q65/p40x40/176057649_1213006852476222_7349829092521007297_n.jpg?_nc_cat=100&ccb=1-5&_nc_sid=dbb9e7&_nc_ohc=mQwiEkVN55cAX-txl-s&_nc_ht=scontent.fakl8-1.fna&oh=00_AT_dqKC3Yhu2jYV9Pf4HJhJmn0yjOMoobEoajX5k4rpWfg&oe=6216E02B',
'recommends': True,
'text': 'good taste ang milktea. creamy',
'time': datetime.datetime(2019, 12, 31, 7, 31, 42),
'timestamp': 1577730702,
'user_url': 'https://facebook.com/app.bennok?locale2=en_US',
'username': 'Boy Montaos'}
{'post_url': 'https://facebook.com/story.php?story_fbid=4077043325658195&id=100000577028543&locale2=en_US&__tn__=%2As%2As',
'profile_picture': 'https://scontent.fakl8-1.fna.fbcdn.net/v/t39.30808-1/cp0/e15/q65/p40x40/252319237_5083768481652336_441345146184154296_n.jpg?_nc_cat=103&ccb=1-5&_nc_sid=dbb9e7&_nc_ohc=tr5R8QAt6-QAX-fEtyM&_nc_ht=scontent.fakl8-1.fna&oh=00_AT-P6GENcwW8sFQv1v1rnFmUbCeZYPwfUx0zXi9sK7bNiQ&oe=61F3BBC3',
'recommends': True,
'text': 'Super Affordable and yummy. napaka bilis pa nang service and '
'delivery. 😊👍',
'time': datetime.datetime(2020, 12, 9, 2, 27, 39),
'timestamp': 1607434059,
'user_url': 'https://facebook.com/hannahniah.lim?locale2=en_US',
'username': 'Hananiah Fermin Lim'}
{'post_url': 'https://facebook.com/story.php?story_fbid=3178356272178102&id=100000112820909&locale2=en_US&__tn__=%2As%2As',
'profile_picture': 'https://scontent.fakl8-1.fna.fbcdn.net/v/t39.30808-1/cp0/e15/q65/p40x40/218824795_6398471983499832_3335334123648518092_n.jpg?_nc_cat=104&ccb=1-5&_nc_sid=dbb9e7&_nc_ohc=sJ5901n2oWwAX8qVkQO&_nc_ht=scontent.fakl8-1.fna&oh=00_AT_VL2g9-9jqNkGWNfJOZJnxX9ejqgBsGMrFNdZ-wI_yCg&oe=61F38F86',
'recommends': True,
'text': 'ok naman. patamisin lang ng konti yung pearl 😊',
'time': datetime.datetime(2019, 6, 4, 0, 53, 25),
'timestamp': 1559566405,
'user_url': 'https://facebook.com/yzhanyzhi?locale2=en_US',
'username': 'Yazmine C J Bautista'}
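Since each review is a plain dict, the results are easy to aggregate. The following is a small sketch that works on dicts shaped like the output above, using only the recommends and time keys. The function name `summarize_reviews` is hypothetical, and the sample data is a trimmed copy of the three reviews shown, so no network access is needed.

```python
from datetime import datetime
from typing import Any, Dict, Iterable


def summarize_reviews(reviews: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Count reviews, count recommendations, and find the newest review time."""
    total = 0
    recommended = 0
    latest = None
    for review in reviews:
        total += 1
        if review.get("recommends"):
            recommended += 1
        when = review.get("time")
        if when is not None and (latest is None or when > latest):
            latest = when
    return {"total": total, "recommended": recommended, "latest": latest}


# Trimmed copies of the reviews printed above.
sample_reviews = [
    {"username": "Boy Montaos", "recommends": True,
     "time": datetime(2019, 12, 31, 7, 31, 42)},
    {"username": "Hananiah Fermin Lim", "recommends": True,
     "time": datetime(2020, 12, 9, 2, 27, 39)},
    {"username": "Yazmine C J Bautista", "recommends": True,
     "time": datetime(2019, 6, 4, 0, 53, 25)},
]

summary = summarize_reviews(sample_reviews)
print(summary)
```

Because the function accepts any iterable, it should also accept the generator stored under the reviews key, assuming that generator yields dicts like these.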
This page's profile contains a reviews generator object, but iterating over it raises Content Not Found.
Below is the code used to scrape the page with get_page_info:
from facebook_scraper import *
from pprint import pprint
set_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
profile = get_page_info('atebeyandsell')
Here is the output:
{'about': 'About\n'
'Send message\n'
'Entrepreneur · Gaming Video Creator\n'
'See all',
'address': None,
'followers': 7964,
'identifier': 107313684786461,
'image': None,
'likes': 6616,
'name': 'Ate Bey and Sell',
'profile_photo': 'https://scontent.fmnl4-3.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164655260_107316611452835_8528683889612977791_n.jpg?_nc_cat=110&ccb=1-5&_nc_sid=ed5ff1&efg=eyJpIjoidCJ9&_nc_eui2=AeH1BAfJPhyOlrPCVM-i5RSMckwgHk9sgqFyTCAeT2yCoRsyBrnuYXXkf8OdF8DXgEHEC2SHH_Dx7Ks7cSHtfxxq&_nc_ohc=WOM3rj6xiC0AX9Ct3Y1&_nc_ht=scontent.fmnl4-3.fna&oh=00_AT9WN2G_WwPbs8fQFtp1ho1EK9kQdfE9EK6q-WaB5ixntQ&oe=6232C265',
'rating': 'Entrepreneur',
'reviews': <generator object FacebookScraper.get_page_reviews at 0x7f9cb1346430>,
'sameAs': 'instagram.com/atebeyofficial',
'type': 'Person',
'url': 'https://www.facebook.com/atebeyandsell/'}
From the output, it can be seen that there is a generator object for the reviews key. However, when trying to access it using the code below
for i in profile['reviews']:
    print(i)
it throws the following error:
NotFound Traceback (most recent call last)
/var/folders/25/k79djfcj737dwxvhtkr192zr8p8x5p/T/ipykernel_15908/4199099289.py in <module>
----> 1 for i in profile['reviews']:
2 print(i)
~/opt/anaconda3/envs/sample/lib/python3.8/site-packages/facebook_scraper/facebook_scraper.py in get_page_reviews(self, page, **kwargs)
521 while more_url:
522 logger.debug(f"Fetching {more_url}")
--> 523 response = self.get(more_url)
524 if response.text.startswith("for (;;);"):
525 prefix_length = len('for (;;);')
~/opt/anaconda3/envs/sample/lib/python3.8/site-packages/facebook_scraper/facebook_scraper.py in get(self, url, **kwargs)
805 if title:
806 if title.text.lower() in not_found_titles:
--> 807 raise exceptions.NotFound(title.text)
808 elif title.text.lower() == "error":
809 raise exceptions.UnexpectedResponse("Your request couldn't be processed")
NotFound: Content Not Found
The reviews aren't accessible at https://m.facebook.com/pg/atebeyandsell/reviews/ either. This page must have reviews disabled.
The problem might be with m.facebook.com. The mobile version cannot open some URLs.
The reviews aren't accessible at https://www.facebook.com/atebeyandsell/reviews either.
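Because the reviews value is a lazy generator, the error only surfaces once iteration starts, so a caller who wants to tolerate pages with reviews disabled has to guard the loop itself. Below is a minimal sketch of that pattern. The helper `collect_or_none` is hypothetical, and `FakeNotFound` and `fake_reviews` are stand-ins so the example runs without network access or the library installed.

```python
from typing import Any, Iterable, List, Optional, Type


def collect_or_none(
    items: Iterable[Any], not_found: Type[BaseException]
) -> Optional[List[Any]]:
    """Drain a lazy iterable; return None if it raises `not_found`."""
    try:
        return list(items)
    except not_found:
        return None


# Stand-in exception and generator, for illustration only.
class FakeNotFound(Exception):
    pass


def fake_reviews():
    # Mimics a generator that fails on its first fetch.
    raise FakeNotFound("Content Not Found")
    yield  # unreachable, but makes this function a generator


print(collect_or_none(fake_reviews(), FakeNotFound))  # None
print(collect_or_none(iter([{"text": "ok"}]), FakeNotFound))  # [{'text': 'ok'}]
```

With the real library, the traceback above suggests the exception class to pass would be facebook_scraper.exceptions.NotFound, for example collect_or_none(profile['reviews'], exceptions.NotFound); treat that as an assumption to check against the installed version.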