python-for-data-and-media-communication-gitbook

How to scrape all urls in one article

Open MindyZHAOMinzhu opened this issue 5 years ago • 2 comments

My environment

  • Device/ Operating System: ios
  • Python version: python3
  • Which chapter of book?: week07 notebook

My question

I am studying chapter 7 of the Python notebook. While learning how to scrape all URLs in one article, I could not understand the code in the notebook, and I found no further explanation of why this code solves the problem. [Screenshot 2019-07-28 2:41:01 PM]

Describe the efforts you have spent on this issue

I searched Google and tried to find useful videos online, but I do not know how to search for related answers. I also cannot find post__title-link in the Chrome developer tools. I am confused about it.

MindyZHAOMinzhu, Jul 28 '19 06:07

post__title-link is an attribute value (most likely a class name) on the website you are crawling. It is not visible on the rendered page, only in the HTML, so you need the Chrome developer tools to inspect it.

find_all is a method from the Beautiful Soup library (bs4), not from requests; requests only downloads the page. When a module confuses you, the best approach is to search for a solution on Google, such as this link, because most of the problems you face have already been solved by other developers.
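To make the idea concrete, here is a rough sketch of what a call like `soup.find_all("a", class_="post__title-link")` does in Beautiful Soup: it walks the HTML, keeps every `<a>` tag whose class list contains `post__title-link`, and lets you read each tag's `href`. The sketch below uses only Python's standard-library `html.parser` so it runs without installing anything. The sample HTML, the `example.com` URLs, and the `LinkCollector` class are made up for illustration and are not the notebook's actual code.

```python
from html.parser import HTMLParser

# Hypothetical HTML standing in for the downloaded article list page.
SAMPLE_HTML = """
<h2><a class="post__title-link" href="https://example.com/first-post">First post</a></h2>
<h2><a class="post__title-link featured" href="https://example.com/second-post">Second post</a></h2>
<a class="nav-link" href="https://example.com/about">About</a>
"""


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that carry a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # A class attribute can hold several space-separated class names.
        classes = (attr_map.get("class") or "").split()
        if self.class_name in classes and attr_map.get("href"):
            self.urls.append(attr_map["href"])


collector = LinkCollector("post__title-link")
collector.feed(SAMPLE_HTML)
urls = collector.urls
print(urls)
```

With Beautiful Soup installed, the same result would typically come from `[a["href"] for a in soup.find_all("a", class_="post__title-link")]`, where `soup` is built from the page's HTML. The link with class `nav-link` is skipped because its class does not match.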

If you still cannot understand how to use the module, check the official documentation or open an issue on GitHub.

ConnorLi96, Jul 30 '19 14:07

Thank you a lot!!! I get it!!! At first, I could not find the right attribute in the HTML. It seems I was not looking at the right website!! I will try to find the solution myself when I face a problem like this again.

MindyZHAOMinzhu, Jul 30 '19 22:07