Scraping-TripAdvisor-with-Python-2020

Extract Tripadvisor reviews from a specific page with Google Colab

Open biagioscalingipsy opened this issue 2 years ago • 0 comments

Hi Giuseppe! I should say up front that I am very new to Python, and for now I am using Google Colab to run my code. In particular, I am trying to extract the TripAdvisor reviews from this page: https://www.tripadvisor.it/Attraction_Review-g2173026-d8059630-Reviews-Bungee_Jumping_Asiago_Enego_Foza_175_metri-Foza_Province_of_Vicenza_Veneto.html

I made several attempts using BeautifulSoup:

    import requests
    from bs4 import BeautifulSoup as soup

    # URL of the TripAdvisor page
    url = 'https://www.tripadvisor.it/Attraction_Review-g2173026-d8059630-Reviews-Bungee_Jumping_Asiago_Enego_Foza_175_metri-Foza_Province_of_Vicenza_Veneto.html'

    # Make the HTTP request to fetch the page content
    html = requests.get(url)
    bsobj = soup(html.content, 'html.parser')

    # Find all 'q' tags that contain the reviews
    reviews = []
    for r in bsobj.findAll('q'):
        reviews.append(r.span.text.strip())
        print(r.span.text.strip())

    # Print the extracted reviews
    for review in reviews:
        print(review)

The code seems to work, but it runs for a very long time and the Colab session eventually disconnects for being idle too long (I even tried adding an automatic click to avoid the timeout, but that did not help).
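One possible factor in the long runtime is that the request is sent without a timeout, so it can wait indefinitely for a response. The sketch below shows the general idea of setting an explicit timeout and a browser-like header. It uses the standard library's `urllib` rather than `requests` so that it has no dependencies; `requests.get` accepts comparable `timeout=` and `headers=` arguments. The names `build_request`, `fetch_html`, and `DEFAULT_HEADERS`, the header value, and the 15-second default are all illustrative assumptions, and whether TripAdvisor responds to such a request has not been verified here.

```python
import urllib.request

# Assumed browser-like header (hypothetical value, not taken from the issue).
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def build_request(url, headers=None):
    """Return a urllib Request that carries explicit HTTP headers."""
    merged = dict(DEFAULT_HEADERS)
    merged.update(headers or {})
    return urllib.request.Request(url, headers=merged)


def fetch_html(url, timeout=15.0):
    """Fetch a page as bytes.

    Blocking network operations that exceed `timeout` seconds raise an
    exception instead of waiting indefinitely.
    """
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With this in place, a stalled connection fails with an error after the timeout, which makes it easier to tell a blocked request apart from a slow one.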

After that, I tried following your script, but when I run driver = webdriver.Safari() I get this error: "Exception: SafariDriver was not found; are you using Safari 10 or later? You can download Safari from https://developer.apple.com/safari/download/".

The thing is, I have the latest version of Safari (version 16.5.1), and I have also enabled "Allow Remote Automation" in Safari's Develop menu. How do you think I can save the reviews to a txt file or load them into a dataframe?
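For the saving step on its own, here is a minimal sketch, assuming the reviews have already been collected into a list of strings like the `reviews` list in the snippet above. The function name `save_reviews`, the file paths, and the `review` column name are hypothetical. It uses only the standard library and writes a CSV file instead of building a dataframe directly.

```python
import csv
from pathlib import Path


def save_reviews(reviews, txt_path, csv_path):
    """Write reviews to a text file (one per line) and to a one-column CSV.

    Whitespace inside each review is collapsed so every review stays on a
    single line; empty entries are skipped. Returns the number written.
    """
    cleaned = [" ".join(text.split()) for text in reviews if text and text.strip()]

    # Plain text: one review per line.
    Path(txt_path).write_text(
        "".join(line + "\n" for line in cleaned), encoding="utf-8"
    )

    # CSV with a header row, so it can be loaded as a table later.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["review"])
        writer.writerows([text] for text in cleaned)

    return len(cleaned)
```

The resulting CSV can be loaded into a dataframe with `pandas.read_csv(csv_path)`. Alternatively, `pandas.DataFrame({"review": reviews})` builds one directly from the list.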

Thank you in advance.

biagioscalingipsy, Jul 10 '23 09:07