browsertrix
[Feature]: Add checkbox to workflow editor to fetch robots.txt and respect disallows
Description
Related to https://github.com/webrecorder/browsertrix-crawler/issues/631
Once this feature is released in the crawler (PR: https://github.com/webrecorder/browsertrix-crawler/pull/888), we'll want to add the corresponding option to the Browsertrix UI.
We may want to set a minimum crawler version for this feature, similar to `min_seed_file_crawler_image` and `min_autoclick_crawler_image`.
Requirements
- Checkbox option for adding `--robots` flag to crawler args
- Configurable minimum crawler version check in backend
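As a rough illustration of the second requirement, the backend check could look something like the sketch below. Everything here is an assumption for discussion rather than existing Browsertrix code: the setting name `min_robots_crawler_image` (modeled on the existing `min_*_crawler_image` settings), the helper names, the version numbers, and the choice to let non-numeric tags such as `latest` pass the check.

```python
import re
from typing import List, Optional, Tuple


def parse_image_version(image: str) -> Optional[Tuple[int, ...]]:
    """Return the numeric version from an image tag, e.g. 'repo/name:1.10.0' -> (1, 10, 0).

    Returns None when there is no tag or the tag is not numeric (e.g. 'latest').
    """
    _, sep, tag = image.rpartition(":")
    # A '/' after the last ':' means the colon belonged to a registry port, not a tag.
    if not sep or "/" in tag:
        return None
    match = re.match(r"v?(\d+(?:\.\d+)*)", tag)
    if not match:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def supports_robots(crawler_image: str, min_robots_crawler_image: str) -> bool:
    """Hypothetical check: is crawler_image at or above the configured minimum?"""
    if not min_robots_crawler_image:
        return True  # no minimum configured
    current = parse_image_version(crawler_image)
    minimum = parse_image_version(min_robots_crawler_image)
    if current is None or minimum is None:
        return True  # assumption: tags we cannot compare are allowed through
    # Pad with zeros so that 1.10 and 1.10.0 compare as equal.
    width = max(len(current), len(minimum))
    current += (0,) * (width - len(current))
    minimum += (0,) * (width - len(minimum))
    return current >= minimum


def build_robots_args(
    use_robots: bool, crawler_image: str, min_robots_crawler_image: str
) -> List[str]:
    """Add --robots only when the checkbox is set and the crawler is new enough."""
    if use_robots and supports_robots(crawler_image, min_robots_crawler_image):
        return ["--robots"]
    return []
```

With placeholder versions, `build_robots_args(True, "webrecorder/browsertrix-crawler:1.9.0", "webrecorder/browsertrix-crawler:1.10.0")` would return an empty list, so the flag is silently dropped for older crawlers. Whether the backend should instead reject the workflow or hide the checkbox in the UI is an open design question.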
Context
No response