scrape-it

Alternative selectors for one element

Open jpujol880807 opened this issue 4 years ago • 1 comment

Hi, congratulations on the great work on this project. I have a use case that is a little difficult to handle with this package. Suppose we are scraping a website that runs A/B tests, so its HTML sometimes changes slightly. I would like to have alternative selectors for a single key, something like this:

    ...
    content: {
        selectors: [
            {
                selector: ".article-content",
                how: "html"
            },
            {
                selector: "#article-content",
                attr: "data-content"
            }
        ]
    }
    ...

This is because the page I scrape sometimes presents the content inside the element with class article-content, and other times in the data-content attribute of the element with id article-content. I would like to have both selectors and evaluate them in order, so that if the first selector fails, the second is tried, and so on. Is there a clean way of implementing multiple selectors for a single item? If not, I think this would be a nice feature for the project.
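Until something like this exists in the library, the ordered fallback described above can be approximated in user code. The following is only a sketch: `scrapeFirstMatch` is a hypothetical helper, not part of scrape-it, and `scrapeFn` is assumed to behave like `scrapeIt.scrapeHTML(input, opts)`, returning an object with the same keys as `opts`.

```javascript
// Hypothetical helper (not a scrape-it API): try each alternative in order
// and return the first non-empty result.
// `scrapeFn` is assumed to have the shape of scrapeIt.scrapeHTML(input, opts).
function scrapeFirstMatch(scrapeFn, html, alternatives) {
  for (const opts of alternatives) {
    // Scrape a single key named "value" using the current alternative.
    const { value } = scrapeFn(html, { value: opts });
    if (value !== undefined && value !== null && value !== "") {
      return value;
    }
  }
  // None of the alternatives produced a value.
  return null;
}

// The two alternatives from the example above, in priority order.
const contentAlternatives = [
  { selector: ".article-content", how: "html" },
  { selector: "#article-content", attr: "data-content" }
];
```

With scrape-it installed, a call might look like `scrapeFirstMatch(scrapeIt.scrapeHTML, page, contentAlternatives)`, where `page` is whatever input your version of `scrapeHTML` accepts. Treating an empty string as "not found" is a design choice of this sketch; drop that check if empty content is a valid result for your pages.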

jpujol880807, Mar 03 '21 05:03

This could be interesting to implement. However, to keep the code simpler, I would suggest calling scrapeHTML on the response body twice: the first call detects which version of the page was loaded, and the second does the final scraping.
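A minimal sketch of that two-pass approach follows. The names `VARIANTS`, `detectVariant`, and `scrapeArticle` are hypothetical, and `scrapeFn` stands in for `scrapeIt.scrapeHTML`; it is passed in as a parameter so the sketch does not depend on a particular scrape-it version. It assumes a missing element yields a falsy value in the first pass.

```javascript
// Hypothetical option sets, one per page variant (names are illustrative).
const VARIANTS = {
  classBased: { content: { selector: ".article-content", how: "html" } },
  attrBased: { content: { selector: "#article-content", attr: "data-content" } }
};

// Pass 1: probe for a marker element to decide which variant was served.
// Assumes the scraped value is falsy when the element is absent.
function detectVariant(scrapeFn, html) {
  const probe = scrapeFn(html, {
    marker: { selector: ".article-content", how: "html" }
  });
  return probe.marker ? "classBased" : "attrBased";
}

// Pass 2: scrape with the option set that matches the detected variant.
function scrapeArticle(scrapeFn, html) {
  const variant = detectVariant(scrapeFn, html);
  return { variant, data: scrapeFn(html, VARIANTS[variant]) };
}
```

Compared with per-key fallbacks, this keeps each option set a plain scrape-it options object, at the cost of scraping the same document twice.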

IonicaBizau, Apr 15 '21 12:04