newspaper4k

extract tags from breadcrumb

Open AndyTheFactory opened this issue 2 years ago • 0 comments

Issue by Ennoriel, Tue Mar 5 20:25:27 2019. Originally opened as https://github.com/codelucas/newspaper/issues/685


Hello there, I suggest that when no tags are found on the page, the parser tries to find a breadcrumb and extracts its elements as tags. An example:

<ul class="breadcrumb">
  <li class="breadcrumb__parent">
    <a class="logo__societe logo__societe--article" href="/societe/">Société</a>
  </li> 
  <li class="breadcrumb__child">
    <a class="logo__prisons logo__prisons--article" href="/prisons/">Prisons</a>
  </li>
</ul>

It could be:

if a ul (or another element?) has a "breadcrumb" class or id:
  take its last-level children that contain an <a> tag with an href attribute and text
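
To make the idea concrete, here is a minimal sketch of that rule in Python. It is not newspaper's actual implementation: it uses only the standard library's html.parser, and the names BreadcrumbTagParser and extract_breadcrumb_tags are hypothetical. It also assumes that any element whose class or id contains the substring "breadcrumb" counts as a breadcrumb, and that every link with an href and non-empty text inside it becomes a tag.

```python
from html.parser import HTMLParser


class BreadcrumbTagParser(HTMLParser):
    """Collect link texts inside the first-level breadcrumb elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self._container = None  # tag name of the breadcrumb element we are inside
        self._depth = 0         # nesting depth of same-named tags inside it
        self._link_text = None  # text chunks of the current <a href=...>, if any

    @staticmethod
    def _is_breadcrumb(attrs):
        # Assumption: a substring match on class tokens or id is good enough.
        attrs = dict(attrs)
        names = (attrs.get("class") or "").split() + [attrs.get("id") or ""]
        return any("breadcrumb" in name.lower() for name in names)

    def handle_starttag(self, tag, attrs):
        if self._container is None:
            if self._is_breadcrumb(attrs):
                self._container, self._depth = tag, 1
            return
        if tag == self._container:
            self._depth += 1
        elif tag == "a" and dict(attrs).get("href"):
            self._link_text = []

    def handle_data(self, data):
        if self._link_text is not None:
            self._link_text.append(data)

    def handle_endtag(self, tag):
        if self._container is None:
            return
        if tag == "a" and self._link_text is not None:
            # Normalize whitespace and skip empty or duplicate labels.
            text = " ".join("".join(self._link_text).split())
            if text and text not in self.tags:
                self.tags.append(text)
            self._link_text = None
        elif tag == self._container:
            self._depth -= 1
            if self._depth == 0:
                self._container = None


def extract_breadcrumb_tags(html):
    """Return breadcrumb link texts in document order."""
    parser = BreadcrumbTagParser()
    parser.feed(html)
    parser.close()
    return parser.tags
```

Following the suggestion above, a caller would use this only as a fallback, for example `tags = existing_tags or extract_breadcrumb_tags(html)`, where existing_tags stands for whatever the regular tag extraction returned.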

If I have some time, I'll try to implement it.

AndyTheFactory, Oct 24 '23 14:10