Node.js web scraper. Contains a command line, Docker container, Terraform module and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless cli...

Results 12 scraper issues
The current development [notes](https://github.com/get-set-fetch/scraper/blob/main/development.md) are incomplete.

When running against selectors like ` [ ".classA > .classB", ".classA > .classC" ] ` getSelectorBase will return an invalid selector: `"classA > "`. Don't rely (just) on space to...
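One way to avoid the dangling combinator is to compare the selectors token by token instead of character by character. The sketch below is hypothetical and is not the library's `getSelectorBase`; the names `tokenize` and `commonSelectorBase` are assumptions made for illustration. It deliberately ignores selectors where `>`, `+` or `~` appear inside attribute values or functional pseudo-classes such as `:nth-child(2n+1)`.

```typescript
// Hypothetical sketch, not the actual getSelectorBase implementation.
// Assumption: selectors only use the descendant (space), ">", "+" and "~" combinators,
// and none of those characters occur inside attribute values or pseudo-class arguments.
const COMBINATORS = new Set(['>', '+', '~']);

// Split a selector into compound selectors and explicit combinators.
function tokenize(selector: string): string[] {
  return selector
    .replace(/\s*([>+~])\s*/g, ' $1 ') // isolate explicit combinators
    .trim()
    .split(/\s+/);
}

// Return the longest leading token sequence shared by all selectors.
function commonSelectorBase(selectors: string[]): string {
  const tokenLists = selectors.map(tokenize);
  const first = tokenLists[0] ?? [];

  // Index of the first token that is not shared by every selector.
  const mismatch = first.findIndex(
    (token, i) => !tokenLists.every(tokens => tokens[i] === token),
  );
  const base = first.slice(0, mismatch === -1 ? first.length : mismatch);

  // A valid base selector must not end in a combinator such as ">".
  while (base.length > 0 && COMBINATORS.has(base[base.length - 1] as string)) {
    base.pop();
  }
  return base.join(' ');
}

const base = commonSelectorBase(['.classA > .classB', '.classA > .classC']);
console.log(base);
```

For the two selectors from the issue, the shared tokens are `.classA` and `>`; dropping the trailing combinator leaves `.classA`, which is a valid selector on its own.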

Right now SchemaType, which generates a TypeScript type from a JSON Schema, doesn't support the "required" field. Both required and optional JSON Schema properties result in optional TypeScript properties. translate required json-schema...
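A minimal sketch of how "required" could be reflected at the type level is shown below. It is not the library's `SchemaType`; the names `PrimitiveMap`, `ObjectSchema`, `RequiredKeys` and `FromSchema` are hypothetical, and the sketch assumes a flat object schema with primitive property types and an explicit `required` array declared `as const`.

```typescript
// Hypothetical sketch, not the library's SchemaType.
// Assumption: flat object schemas with primitive property types only.
type PrimitiveMap = {
  string: string;
  number: number;
  integer: number;
  boolean: boolean;
};

type ObjectSchema = {
  type: 'object';
  properties: Record<string, { type: keyof PrimitiveMap }>;
  required: readonly string[];
};

// Property names listed under "required" that actually exist in "properties".
type RequiredKeys<S extends ObjectSchema> =
  Extract<S['required'][number], keyof S['properties']>;

// Required properties become mandatory keys, all others stay optional.
type FromSchema<S extends ObjectSchema> =
  { [K in RequiredKeys<S>]: PrimitiveMap[S['properties'][K]['type']] } &
  { [K in Exclude<keyof S['properties'], RequiredKeys<S>>]?: PrimitiveMap[S['properties'][K]['type']] };

// "as const" keeps the literal types needed for the mapping above.
const userSchema = {
  type: 'object',
  properties: {
    id: { type: 'integer' },
    nickname: { type: 'string' },
  },
  required: ['id'],
} as const;

type User = FromSchema<typeof userSchema>;

const full: User = { id: 1, nickname: 'ada' };
const minimal: User = { id: 2 }; // "nickname" may be omitted, "id" may not
console.log(full, minimal);
```

Under these assumptions, `User` has a mandatory `id: number` and an optional `nickname?: string`, so omitting `id` is rejected by the compiler while omitting `nickname` is accepted.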

bug

Right now all acceptance tests perform scraping with the built-in plugins. Add a new test defining a custom plugin with a node_modules dependency in order to test plugin bundling.

As plugins and plugin options change, versioning will warn about scrape definitions that may no longer be valid.

start/stop scraping multiple projects at the same time

add a ClickPlugin responsible for performing javascript click actions

- scrape resource by resource
- scrape in parallel