Parse only a certain directory
Issue by savandriy
Thu Jul 7 18:20:03 2016
Originally opened as https://github.com/codelucas/newspaper/issues/267
Hi!
I started using the newspaper library and ran into a problem. I want to parse only one category of a website, so I tried this:
newspaper.build('http://www.kyivpost.com/technology/')
But this gives me all of the articles from the website, not just that category. I didn't find any setting in the documentation for choosing which categories to parse. In the library's code I saw functions like set_categories and download_categories. Should I use them in some way? I want to get only the articles from this category.
Comment by Neileruaa
Sun Mar 21 13:31:00 2021
Hey! Did you find a solution to your problem? I am also looking for one.
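
One possible workaround, offered as a sketch rather than a confirmed answer from the library: build the source as usual, then keep only the articles whose URL sits under the category path. This assumes the site puts the category in the article URL (for example http://www.kyivpost.com/technology/some-article.html), which may not hold for every site. The helper names in_section and section_articles are hypothetical and not part of newspaper; only newspaper.build, its memoize_articles option, source.articles and article.url come from the library.

```python
from urllib.parse import urlparse


def in_section(article_url, section_url):
    """Return True if article_url is on the same host as section_url and under its path."""
    article = urlparse(article_url)
    section = urlparse(section_url)
    # Normalize the section path so '/technology' and '/technology/' behave the same,
    # and so '/technology-news/...' is not mistaken for '/technology/...'.
    section_path = section.path.rstrip("/") + "/"
    return article.netloc == section.netloc and article.path.startswith(section_path)


def section_articles(section_url):
    """Build the source with newspaper and keep only articles under section_url.

    Hypothetical helper: it filters after the build instead of configuring categories.
    """
    import newspaper  # imported lazily so in_section works without the library installed

    # memoize_articles=False stops newspaper from skipping articles seen in earlier runs.
    source = newspaper.build(section_url, memoize_articles=False)
    return [a for a in source.articles if in_section(a.url, section_url)]


if __name__ == "__main__":
    # Requires network access and the newspaper package.
    for article in section_articles("http://www.kyivpost.com/technology/"):
        print(article.url)
```

The host comparison is exact, so www.kyivpost.com and kyivpost.com count as different sites. If the site mixes both, the check would need to be loosened. This also does not reduce what newspaper crawls; it only narrows the resulting article list.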