Problem scraping numeric data from Dianping.com
Troubleshooting
Describe your environment
- Operating system: macOS 10.13.6
- Python version: 3.6
- Hardware: MacBook Pro 13-inch, 2017
- Jupyter notebook or not? [Y/N]: Y
Describe your question
I cannot get the number (the review count inside the `review-num` tag) from the Dianping website.
The minimum code (snippet) to reproduce the issue
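```python
from bs4 import BeautifulSoup
from selenium import webdriver

url = 'http://www.dianping.com/chengdu/ch10/g34060r1577'
browser = webdriver.Chrome()
browser.get(url)
# Grab the rendered HTML of the whole page.
# Selenium 3 API; Selenium 4 uses find_element(By.CSS_SELECTOR, 'html').
h = browser.find_element_by_css_selector('html')
t = h.get_attribute('innerHTML')
mypage = BeautifulSoup(t, 'html.parser')
dianping_list = []
h = mypage.find('div', attrs={'class': 'content'})
i = h.find_all('div', attrs={'class': 'txt'})
# Look up the review-count link of the first shop in the list.
remark = i[0].find('div', attrs={'class': 'comment'}).find('a', attrs={'class': 'review-num'})
# The <b> tag is where the number should be, but it does not come out as plain text.
remark.b
```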
link: https://github.com/zacharyzeng/Bug_Centre/blob/master/dianping.ipynb
The solution is here:
https://github.com/hupili/python-for-data-and-media-communication/blob/master/scraper-selenium/dianping%20comment%20number.ipynb
The sample data is here:
https://github.com/hupili/python-for-data-and-media-communication/blob/master/scraper-selenium/dianping.csv
This case is well beyond our curriculum, but it is good that you brought it up. The demo code may not work directly on your side. You need to study my logic and revise the decoder table and the decode function accordingly. The way of analysis is more important than the result.
P.S. This issue is an excellent demo of how to ask questions efficiently.
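To make the "decoder table" and "decode function" idea concrete, here is a minimal sketch. It is not the code from the linked notebook. It assumes the page hides some digits behind `<span>` tags whose CSS class determines which digit is drawn, while other digits appear as plain text. The class names (`fn-aaa`, etc.), the table contents, and the `decode` helper are all hypothetical; the real mapping has to be worked out by inspecting the live page.

```python
from html.parser import HTMLParser

# Hypothetical decoder table: obfuscated CSS class -> digit it renders as.
# These class names are made up for illustration; the real ones must be
# read off the live page and may change over time.
DECODER_TABLE = {
    'fn-aaa': '0',
    'fn-bbb': '3',
    'fn-ccc': '7',
}


class _DigitCollector(HTMLParser):
    """Walk an HTML fragment in order, collecting plain and encoded digits."""

    def __init__(self, table):
        super().__init__()
        self.table = table
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # A tag may carry several classes; use the first one found in the table.
        css_class = dict(attrs).get('class') or ''
        for name in css_class.split():
            if name in self.table:
                self.parts.append(self.table[name])
                break

    def handle_data(self, data):
        # Digits that are not obfuscated appear as ordinary text.
        self.parts.append(data.strip())


def decode(fragment, table=DECODER_TABLE):
    """Return the number string hidden in an HTML fragment such as str(remark.b)."""
    collector = _DigitCollector(table)
    collector.feed(fragment)
    collector.close()
    return ''.join(collector.parts)


sample = '<b>1<span class="fn-bbb"></span><span class="fn-aaa"></span>7</b>'
print(decode(sample))  # prints 1307
```

With a table that matches the real page, `decode(str(remark.b))` would be the place to plug this into the snippet from the question. When the site changes its class names, only the table needs to be revised.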
Thanks Pili, I will try to learn and decode it.