python-for-data-and-media-communication-gitbook
Selenium click button to select categories and search
Troubleshooting
Describe your environment
- Operating system: OS
- Python version: 3
- Hardware:
- Internet access:
- Jupyter notebook or not? [Y/N]: Y
- Which chapter of book?: Week 6
Describe your question
How can I scrape the content of a floating window? (It is a tooltip that only appears when the mouse hovers over a certain spot and disappears as soon as the mouse moves away.) This floating content does not show up in the page source in F12, so I don't know how to scrape it. Could someone help me? Thanks so much! @hupili @ChicoXYC
The minimum code (snippet) to reproduce the issue
Describe the efforts you have spent on this issue
After searching on Baidu, I basically only found solutions for scraping pop-up window content; I could not find a solution for a case similar to this one.
@ZhangNingNina can you give me the example code of what you want to scrape?
Sure. For example, on this website: https://www.zhipin.com/?sid=sem_pz_bdpc_dasou_title I was wondering how to scrape the industry categories on the left side of the webpage in detail. It's hard to find the source code because the detailed information appears only when I hover the cursor over it.
@ZhangNingNina hope this may solve your problem: https://github.com/ChicoXYC/exercise/blob/master/boss-%E7%9B%B4%E8%81%98/boss%E7%9B%B4%E8%81%98.ipynb
Thanks a lot!
This method doesn't seem to apply in our case. The difficulty we ran into is that a piece of information only appears in an info box when the mouse hovers over a certain point, for example:
However, this information cannot be brought up by clicking, and it cannot be found in the page source code either.
@ChicoXYC
@iiiJenny in that case, I think we can scrape it another way.

The sub-categories' URLs within one parent category increase by integers, so you can format those URLs directly.
Does this solve the hover issue: https://stackoverflow.com/a/8261754/2446356 ?
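The Stack Overflow answer linked above works by simulating the mouse hover so that the floating panel is actually rendered into the DOM before you read the page source. Below is a minimal sketch of that idea using Selenium's ActionChains; the CSS selector is a hypothetical placeholder, not taken from the real page, so inspect the site and adjust it to the element you need to hover over.

```python
# Minimal hover sketch with Selenium's ActionChains.
# The selector '.job-menu .menu-sub' is a hypothetical placeholder --
# replace it with the real element from the page you are scraping.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains

driver = webdriver.Chrome()  # assumes chromedriver is installed and on PATH
driver.get('https://www.zhipin.com/?sid=sem_pz_bdpc_dasou_title')

# Find one category entry in the left-hand menu (selector is an assumption).
category = driver.find_element(By.CSS_SELECTOR, '.job-menu .menu-sub')

# Move the mouse over it so the floating panel is rendered; after that,
# the panel's HTML is present in the page source and can be parsed.
ActionChains(driver).move_to_element(category).perform()
print(driver.page_source)

driver.quit()
```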
@ZhangNingNina have you solved the problem?
One solution is to format those sub-category links.
Like the example above:
- the link for Java is https://www.zhipin.com/c101010100-p100101/
- the link for C++ is https://www.zhipin.com/c101010100-p100102/

You can see that only the last number differs, which means we can generate all the URLs this way.
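Here is a short sketch of that URL-formatting approach using requests and BeautifulSoup. The range of sub-category codes and the User-Agent header are assumptions for illustration; check the real site to see which codes actually exist before looping over them.

```python
# Generate the sub-category URLs by filling in the trailing code,
# instead of hovering over the menu. The code range is an assumption
# for illustration -- verify the real codes on the site first.
import requests
from bs4 import BeautifulSoup

base_url = 'https://www.zhipin.com/c101010100-p{}/'

for code in range(100101, 100106):  # e.g. 100101 = Java, 100102 = C++
    url = base_url.format(code)
    response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
    soup = BeautifulSoup(response.text, 'html.parser')
    # Print the page title as a quick sanity check that the URL resolved.
    print(url, soup.title.string.strip() if soup.title else 'no title')
```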
Also, please let me know whether the method @hupili suggested above worked or not. Thanks @ZhangNingNina
Sorry for my late reply. We've tried it and found that this method works. Thank you so much!!!