qidian.com - can't scrape the numbers on the website (special fonts)
Troubleshooting
Describe your environment
- Operating system:
- Python version:
- Hardware:
- Internet access:
- Jupyter notebook or not? [Y/N]: Y
- Which chapter of book?:
Describe your question
I can't scrape the number that shows how many words each novel has. The URL: https://www.qidian.com/all?chanId=2&subCateId=5&size=1&action=0&orderId=&vip=0&month=3&update=1&style=1&pageSize=20&siteid=1&pubflag=0&hiddenField=0&page=1

The minimum code (snippet) to reproduce the issue
import requests
from bs4 import BeautifulSoup
url = 'https://www.qidian.com/all?chanId=2&subCateId=5&size=1&action=0&orderId=&vip=0&month=3&update=1&style=1&pageSize=20&siteid=1&pubflag=0&hiddenField=0&page=1'
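r = requests.get(url)
# Name the parser explicitly so BeautifulSoup does not guess one and emit a warning.
mypage = BeautifulSoup(r.text, 'html.parser')
mypage
import json
# Note: `a` is not defined in this snippet as posted; it presumably holds a list of tags selected from mypage.
json.dumps(a[45].find('span').text)
json.dumps(a[48].find('span').text)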

The site uses a special font to display the numbers, so the characters in the HTML are not regular digits.
We need to find a way to "decode" them into numbers.
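To make the "decoding" idea concrete, here is a minimal sketch, not necessarily the approach used in the notebook linked in this thread. It assumes the custom font maps each odd code point to a glyph named after a digit ("one", "two", "period", and so on), and that you have already loaded that mapping as a dict, for example with fontTools' `TTFont(path).getBestCmap()`. The function name `decode_obfuscated`, the table `GLYPH_TO_CHAR`, and the code points in `sample_cmap` are hypothetical and only for illustration.

```python
# Glyph names that a digit font typically uses, mapped to the real characters.
# Assumption: the site's font names its glyphs this way.
GLYPH_TO_CHAR = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "period": ".",
}


def decode_obfuscated(text, cmap):
    """Translate obfuscated characters into regular digits.

    cmap maps a code point (int) to a glyph name (str). Characters that
    are not in cmap, or whose glyph name is unknown, are left unchanged.
    """
    decoded = []
    for ch in text:
        glyph_name = cmap.get(ord(ch))
        decoded.append(GLYPH_TO_CHAR.get(glyph_name, ch))
    return "".join(decoded)


# Hypothetical mapping; the real one has to be read from the downloaded font file.
sample_cmap = {
    0x18B61: "one",
    0x18B62: "two",
    0x18B63: "period",
    0x18B64: "three",
    0x18B65: "four",
}

obfuscated = "".join(chr(cp) for cp in (0x18B61, 0x18B62, 0x18B63, 0x18B64, 0x18B65))
print(decode_obfuscated(obfuscated + "万字", sample_cmap))  # 12.34万字
```

The mapping may differ between page loads if the site serves a different font file each time, so it is safer to rebuild `cmap` from the font referenced by the page you just downloaded.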
This is too hard for our students. Here's the quick solution. It is better to work through it together with some other students:
https://github.com/hupili/python-for-data-and-media-communication/blob/master/scraper-examples/Qidian%20wordcount.ipynb
Thank you!
Do I need to install something here?
Could you tell me what special modules you used? Thank you!
wget is not a module. It is a Linux/Unix command. You need to search for how to install this tool on your operating system.
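If installing wget is inconvenient, the download step can probably be done in Python itself. The following is a minimal sketch using only the standard library; the helper name `download_file` is hypothetical, and it assumes the file is reachable with a plain request (some sites may require extra headers).

```python
import shutil
import urllib.request


def download_file(url, dest_path):
    """Save the resource at `url` to `dest_path`, similar to `wget -O dest_path url`."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # Stream the response body to disk instead of loading it all into memory.
        shutil.copyfileobj(response, out)
    return dest_path
```

In a notebook you would call it with the font URL found in the page source and a local file name of your choice, then open the saved file with your font library.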