django-dynamic-scraper
Custom HTTP status codes for DDS checker
Hi Holger!
First of all, thank you for this fantastic piece of software!
I'd love for DDS checkers to support custom HTTP status codes. For example, the website that I'm scraping returns 302 codes for items that no longer exist. That would also mean checking for XPath values on redirected URLs.
Looking forward to hearing your thoughts on that.
Hi, I'd be open to that, though I don't have much time at the moment to make the changes in the short term (also, sorry for the late answer).
After reflecting on this for a couple of minutes, I think the following might be a good implementation:
- Replacing 404 type with CUSTOM_HTTP_CODE type
- Replacing 404_OR_X_PATH type with CUSTOM_HTTP_CODE_OR_XPATH type
- Adding a field checker_custom_http_code which defaults to 404
This way, backwards compatibility would also be kept (existing settings wouldn't break).
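A rough sketch of how the check could behave under this proposal. This is plain Python, not DDS's actual code: `CheckerConfig`, `item_is_gone`, and the `xpath_result` field are hypothetical names used only for illustration, and the sketch assumes the HTTP client reports the raw status code (e.g. 302) instead of silently following the redirect.

```python
from dataclasses import dataclass
from typing import Optional

# Type names taken from the proposal; not DDS's real constants.
CUSTOM_HTTP_CODE = "CUSTOM_HTTP_CODE"
CUSTOM_HTTP_CODE_OR_XPATH = "CUSTOM_HTTP_CODE_OR_XPATH"


@dataclass
class CheckerConfig:
    """Hypothetical stand-in for the checker settings of a scraper."""
    checker_type: str
    # Defaulting to 404 means existing settings keep their current behaviour.
    checker_custom_http_code: int = 404
    # Text expected at the checker XPath when the item no longer exists.
    xpath_result: Optional[str] = None


def item_is_gone(config: CheckerConfig, status_code: int,
                 extracted_xpath_value: Optional[str] = None) -> bool:
    """Return True if the response indicates the item was removed."""
    # Both proposed types treat the configured status code as "gone".
    if status_code == config.checker_custom_http_code:
        return True
    # Only the *_OR_XPATH type falls back to comparing the XPath value.
    if (config.checker_type == CUSTOM_HTTP_CODE_OR_XPATH
            and config.xpath_result is not None):
        return extracted_xpath_value == config.xpath_result
    return False
```

With the default, `item_is_gone(CheckerConfig(CUSTOM_HTTP_CODE), 404)` behaves like the current 404 check, while `CheckerConfig(CUSTOM_HTTP_CODE, checker_custom_http_code=302)` would cover the redirect case from the original request.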
What do you think? Does this make sense?
Thanks for the answer. Don't worry about short term changes, I'd rather have them solid instead of rushed.
If you're worried about backwards compatibility, maybe adding a CUSTOM_HTTP_CODE type alongside the existing 404 type, instead of replacing it, makes more sense. Otherwise I think your suggestions sound perfect.
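To illustrate the "add instead of replace" variant, here is a minimal sketch. The function `expected_status_code` and the `LEGACY_404_TYPES` set are hypothetical, and the type keys are just the names used in this thread, not necessarily the values DDS stores internally.

```python
from typing import Optional

# Existing types stay untouched and always mean HTTP 404 (names as used in this thread).
LEGACY_404_TYPES = {"404", "404_OR_X_PATH"}
# Newly added types read the status code from a separate field.
CUSTOM_TYPES = {"CUSTOM_HTTP_CODE", "CUSTOM_HTTP_CODE_OR_XPATH"}


def expected_status_code(checker_type: str,
                         checker_custom_http_code: Optional[int] = None) -> int:
    """Resolve which HTTP status code marks an item as removed."""
    if checker_type in LEGACY_404_TYPES:
        # Old configurations never look at the custom field.
        return 404
    if checker_type in CUSTOM_TYPES:
        if checker_custom_http_code is None:
            raise ValueError("custom checker types need checker_custom_http_code")
        return checker_custom_http_code
    raise ValueError(f"unknown checker type: {checker_type!r}")
```

The trade-off is two extra type choices in exchange for leaving existing scraper configurations completely unchanged.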