8.2.3 - Scrape links from category pages on Trustpilot

Open Choptdei opened this issue 5 years ago • 1 comment

Hi, I have written the following function, but I can't get it working. The lists 'reviews' and 'firmaer' are empty. The tags and classes should be right.

Can you help me?

```python
firmaer = []
reviews = []

def scraper(url):
    trin1 = requests.get(url)
    trin2 = BeautifulSoup(trin1.text, 'html.parser')

    firmaer.append(trin2.find_all('h3', {'class': 'category-business-card__header'}))
    temp_url = trin2.find_all('a', {'class': 'category-business-card card'})
    for review in temp_url:
        reviews.append(url + review['href'])

scraper('https://www.trustpilot.com/categories/social_club')
```

Choptdei avatar Aug 19 '19 15:08 Choptdei

Maybe (probably) the HTML is dynamically generated in your browser, and thus `requests` won't receive HTML containing the correct elements. Have you looked for a JSON file containing the data you need?

kristianolesenlarsen avatar Aug 20 '19 08:08 kristianolesenlarsen
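
One way to test that suggestion is to inspect the raw HTML the server returns, before any JavaScript runs: check whether the expected class names appear at all, and whether the page ships its data inside a JSON `<script>` tag instead. The sketch below is only an illustration under stated assumptions. The `PageProbe` class, the `probe` helper, the `SAMPLE` markup, and the `businesses`/`name`/`path` keys are all hypothetical and do not describe Trustpilot's actual page structure. It uses only the standard library so it runs without extra installs.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageProbe(HTMLParser):
    """Collects CSS class names and the parsed bodies of JSON <script> tags."""

    def __init__(self):
        super().__init__()
        self.classes = set()     # every CSS class present in the raw HTML
        self.json_blobs = []     # parsed contents of JSON-typed <script> tags
        self._in_json_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.classes.update((attrs.get("class") or "").split())
        script_type = attrs.get("type") or ""
        # Matches e.g. application/json and application/ld+json.
        if tag == "script" and script_type.endswith("json"):
            self._in_json_script = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.json_blobs.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # the script body was not valid JSON; skip it


def probe(html):
    """Parse an HTML string and return the populated PageProbe."""
    parser = PageProbe()
    parser.feed(html)
    parser.close()
    return parser


# Hypothetical stand-in for a JavaScript-rendered page: an empty container
# plus the data as embedded JSON. For a live page, use requests.get(url).text.
SAMPLE = """
<html><body>
<div id="app"></div>
<script type="application/json">{"businesses": [{"name": "Example Club", "path": "/review/example.com"}]}</script>
</body></html>
"""

page = probe(SAMPLE)

# If the class is missing here, find_all() on the same HTML returns nothing.
print("category-business-card" in page.classes)

# urljoin resolves a site-relative path against the page URL; plain string
# concatenation (url + href) would append it to the category path instead.
base = "https://www.trustpilot.com/categories/social_club"
businesses = page.json_blobs[0]["businesses"]
links = [urljoin(base, b["path"]) for b in businesses]
print(links)
```

If the class check fails on the real response while the elements are visible in the browser's inspector, the markup is most likely built client-side, and the embedded JSON (or a JSON request visible in the browser's network tab) is the more reliable source.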