NLP-with-Python

Newbie

Open LeoTRESPEUCH opened this issue 7 years ago • 6 comments

Hello Susanli, I tried to use your code but I received this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-2-fede94e53ef8> in <module>
    211 
    212     # get all reviews for 'url' and 'lang'
--> 213     items = scrape(url, lang)
    214 
    215     if not items:

<ipython-input-2-fede94e53ef8> in scrape(url, lang)
     48 
     49 
---> 50     items = parse(session, url + '?filterLang=' + lang)
     51 
     52     return items

<ipython-input-2-fede94e53ef8> in parse(session, url)
     63         return
     64 
---> 65     num_reviews = soup.find('span', class_='reviews_header_count').text # get text
     66     num_reviews = num_reviews[1:-1]
     67     num_reviews = num_reviews.replace(',', '')

AttributeError: 'NoneType' object has no attribute 'text'

I'm a university business professor but a newbie with Python. Could you help me use your solution to scrape TripAdvisor hotel reviews? Thanks in advance.

LeoTRESPEUCH, Jan 23 '19 18:01

Hello Susanli.

I am getting the same error message. I have already checked that the bs4 module is installed on my system.

Are we missing something here?

Thank you very much!

focaalvarez, Feb 20 '19 17:02

Replace line 65 with:

num_reviews = soup.find('span', class_='hotels-hotel-review-community-content-TabBar__tabCount--37DbH').text # get text

Note that the class name has changed. I believe TripAdvisor changed their website after she wrote the code.

It raised the NoneType error because soup.find() found no span with the old class and returned None, and the script then tried to read .text from that None.
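
One way to make this failure easier to diagnose is to check the result of soup.find() before reading .text. The sketch below assumes `soup` is the BeautifulSoup object the script already builds; `find_text` is a hypothetical helper name, not part of the original code.

```python
def find_text(soup, tag_name, class_name):
    """Return the text of the first matching element, or None if nothing matches.

    `soup` is assumed to be the BeautifulSoup object from the script.
    `find_text` is a hypothetical helper, not part of the original code.
    """
    element = soup.find(tag_name, class_=class_name)
    if element is None:
        # soup.find() returns None when no element matches; reading .text
        # from that None is what raised the AttributeError in the traceback.
        return None
    return element.text
```

In the script, line 65 could then become `num_reviews = find_text(soup, 'span', 'reviews_header_count')`, followed by an `if num_reviews is None:` check that prints a message saying the class name probably needs updating.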

kyle10n, Mar 11 '19 10:03

LOL, this is a moving target. Try replacing the suspect line as follows:

# num_reviews = soup.find('span', class_='reviews_header_count').text # get text    
numSpan = soup.select('span[class*="hotels-hotel-review-community-content-TabBar__tabCount--"]')
num_reviews = numSpan[0].text # get text

keithweberrit, Apr 11 '19 06:04

I had the same issue and solved it by inspecting my target page's source. The errors you get probably mean the page's HTML has changed, so you need to review it and update the class names in the code.

silviasanasi, Apr 16 '19 10:04

LOL, this is a moving target. Try replacing the suspect line as follows:

# num_reviews = soup.find('span', class_='reviews_header_count').text # get text    
numSpan = soup.select('span[class*="hotels-hotel-review-community-content-TabBar__tabCount--"]')
num_reviews = numSpan[0].text # get text

The last line gives an error again: list index out of range.
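
A likely cause: soup.select() returns an empty list when no element matches the selector, so indexing it with [0] raises IndexError. A minimal sketch of a guarded version follows, assuming `soup` is the BeautifulSoup object from the script. `select_first_text` is a hypothetical helper name, and the class prefix quoted above may itself be out of date by now.

```python
def select_first_text(soup, css_selector):
    """Return the text of the first element matching a CSS selector, or None.

    `soup` is assumed to be the BeautifulSoup object from the script.
    `select_first_text` is a hypothetical helper, not part of the original code.
    """
    matches = soup.select(css_selector)
    if not matches:
        # An empty list means the selector matched nothing, most likely
        # because the site's class names have changed again.
        return None
    return matches[0].text
```

It could be called as `select_first_text(soup, 'span[class*="hotels-hotel-review-community-content-TabBar__tabCount--"]')`. A None result means the selector needs to be updated from the current page source.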

shivaksh21, Jun 01 '19 05:06

I am getting this error while running the code in a Jupyter notebook.

TypeError                                 Traceback (most recent call last)
in ()
      7 for url in start_urls:
      8     # get all reviews for 'url' and 'lang'
----> 9     items = scrape(start_urls, lang)
     10     if not items:
     11         print('No reviews')

in scrape(url, lang)
      8     })
      9
---> 10     items = parse(session, url+'?filterLang='+lang)
     11     return items

TypeError: can only concatenate list (not "str") to list
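
Judging from the traceback, the loop passes the whole `start_urls` list to scrape() instead of the loop variable `url`, so `url+'?filterLang='+lang` tries to add a string to a list. The sketch below shows the difference in isolation. `build_review_url` is a hypothetical stand-in for the concatenation inside scrape(), and the URLs are placeholders.

```python
def build_review_url(url, lang):
    """Append the language filter to a single URL string.

    Hypothetical stand-in for the concatenation done inside scrape().
    """
    return url + '?filterLang=' + lang


# Placeholder URLs for illustration only.
start_urls = [
    'https://example.com/hotel-a',
    'https://example.com/hotel-b',
]
lang = 'en'

review_urls = []
for url in start_urls:
    # Pass the loop variable `url` (a string), not `start_urls` (a list).
    review_urls.append(build_review_url(url, lang))
```

In the notebook, the equivalent change would probably be `items = scrape(url, lang)` on line 9.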

rkmishracs, Jan 05 '20 07:01