scrape-schema-recipe

Restrict Scraping Formats

Open scientes opened this issue 1 year ago • 0 comments

The current parsing is very inefficient when the format is known beforehand, because extruct still parses all other formats. As recipes are, to my knowledge, always microdata or JSON-LD, telling extruct to parse only those two formats speeds up scraping significantly.
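A rough sketch of what this could look like, using the `syntaxes` argument of `extruct.extract`. The names `RECIPE_SYNTAXES`, `extract_recipe_metadata` and `count_items` are made up for illustration and are not part of scrape-schema-recipe; the choice of exactly these two syntaxes rests on the assumption above that recipes only appear as microdata or JSON-LD.

```python
# Sketch only: restrict extruct to the two syntaxes recipes are assumed to use.
RECIPE_SYNTAXES = ["microdata", "json-ld"]


def extract_recipe_metadata(html: str, url: str) -> dict:
    """Parse only microdata and JSON-LD instead of every supported format."""
    # Imported here so the rest of the module loads without the dependency.
    import extruct  # third-party: pip install extruct

    # `syntaxes` limits which extractors run; the result has one key per syntax.
    return extruct.extract(html, base_url=url, syntaxes=RECIPE_SYNTAXES)


def count_items(data: dict) -> dict:
    """Hypothetical helper: number of extracted items per syntax."""
    return {syntax: len(items) for syntax, items in data.items()}
```

Calling `extract_recipe_metadata(html, url)` on a fetched page should then return a dict with only the `microdata` and `json-ld` keys, so the other extractors are never run. How much time this saves in practice would need to be measured.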

related extruct issue: https://github.com/scrapinghub/extruct/issues/193

scientes, Nov 19 '23 12:11