django-dynamic-scraper

Custom HTTP status codes for DDS checker

Open • WarToaster opened this issue 8 years ago • 2 comments

Hi Holger!

First of all, thank you for this fantastic piece of software!

I'd love for DDS checkers to support custom HTTP status codes. For example, the website that I'm scraping returns 302 codes for items that no longer exist. Supporting that would also mean checking for XPath values on the redirected URLs.

Looking forward to hearing your thoughts on that.

WarToaster avatar Jul 26 '17 14:07 WarToaster

Hi, I'd be open to that, though I don't have much time at the moment to make the changes in the short term (also, sorry for the late answer).

After reflecting on this for a couple of minutes, I think the following might be a good implementation:

  • Replacing 404 type with CUSTOM_HTTP_CODE type
  • Replacing 404_OR_X_PATH type with CUSTOM_HTTP_CODE_OR_XPATH type
  • Adding a field checker_custom_http_code which defaults to 404

This way, backwards compatibility would also be kept (existing settings wouldn't break).

What do you think? Does this make sense?

holgerd77 avatar Aug 08 '17 11:08 holgerd77
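
The three points proposed above could be sketched roughly as follows. This is only an illustration in plain Python, not DDS code: the `CheckerSettings` class, the `checker_x_path_result` field and the `item_is_gone` function are hypothetical names made up for the example. Only the two type names and the `checker_custom_http_code` field with its default of 404 come from the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckerSettings:
    """Hypothetical stand-in for a checker configuration (not the DDS model)."""
    checker_type: str = "CUSTOM_HTTP_CODE"
    # Defaulting to 404 means a checker with no explicit code behaves as before.
    checker_custom_http_code: int = 404
    # Assumed field: text whose presence at the XPath marks a removed item.
    checker_x_path_result: Optional[str] = None


def item_is_gone(settings: CheckerSettings, status_code: int,
                 x_path_text: Optional[str] = None) -> bool:
    """Return True when the response suggests the item no longer exists."""
    # Both checker types treat the configured status code as "gone".
    if status_code == settings.checker_custom_http_code:
        return True
    # Only the combined type falls back to comparing the XPath text.
    if settings.checker_type == "CUSTOM_HTTP_CODE_OR_XPATH":
        return (settings.checker_x_path_result is not None
                and x_path_text == settings.checker_x_path_result)
    return False


# A site that answers 302 for removed items, as in the original request.
redirecting_site = CheckerSettings(checker_custom_http_code=302)
print(item_is_gone(redirecting_site, 302))
print(item_is_gone(CheckerSettings(), 404))
```

Scrapy follows redirects by default, so a spider would probably have to opt in to seeing the 302 response itself, for instance through the `handle_httpstatus_list` or `dont_redirect` request meta keys. How DDS would wire that up is not discussed in this thread.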

Thanks for the answer. Don't worry about short-term changes, I'd rather have them solid than rushed.

If you're worried about backwards compatibility, maybe adding a CUSTOM_HTTP_CODE type instead of replacing the 404 type makes more sense. Otherwise I think your suggestions sound perfect.

WarToaster avatar Aug 15 '17 16:08 WarToaster
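
The additive alternative suggested above, keeping the existing types and adding new ones next to them, might look like the sketch below. The function name `deletion_status_code` is hypothetical, and including a `CUSTOM_HTTP_CODE_OR_XPATH` type alongside `CUSTOM_HTTP_CODE` is an assumption carried over from the earlier proposal. Nothing here is taken from the DDS code base.

```python
# Existing type names stay valid and keep their fixed meaning.
LEGACY_TYPES = {"404", "404_OR_X_PATH"}
# Assumed new type names that read the configurable status code.
CUSTOM_TYPES = {"CUSTOM_HTTP_CODE", "CUSTOM_HTTP_CODE_OR_XPATH"}


def deletion_status_code(checker_type: str,
                         checker_custom_http_code: int = 404) -> int:
    """Return the HTTP status code that marks an item as removed."""
    if checker_type in LEGACY_TYPES:
        # Legacy checkers ignore the custom field, so old settings cannot break.
        return 404
    if checker_type in CUSTOM_TYPES:
        return checker_custom_http_code
    raise ValueError(f"unknown checker type: {checker_type!r}")


print(deletion_status_code("404", 302))
print(deletion_status_code("CUSTOM_HTTP_CODE", 302))
```

With this variant no existing configuration needs to be migrated, at the cost of carrying two pairs of type names that overlap when the custom code is 404.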