Add `User-Agent` in link checking

Open andrewvaughan opened this issue 5 months ago • 9 comments

Is your feature request related to a problem? Please describe.

Many websites respond with a 403 to any request that lacks a User-Agent HTTP request header. Link checkers that don't send one therefore report those links as broken, even when the pages are fine.
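To illustrate the problem, here is a minimal Python sketch of a link check that identifies itself. It is not how markdown-link-check or MegaLinter work internally; the `build_request` helper and the `USER_AGENT` value are hypothetical names made up for this example.

```python
import urllib.request

# Hypothetical User-Agent value, for illustration only.
USER_AGENT = "Mozilla/5.0 (compatible; example-link-checker/0.0.0)"


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Build a HEAD request that carries an explicit User-Agent header."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": user_agent},
        method="HEAD",
    )
```

Passing the result to `urllib.request.urlopen(build_request("https://example.com"), timeout=10)` would send the request with that header. Without the explicit header, urllib identifies itself as `Python-urllib/<version>`, a generic default that some sites may reject.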

For https://github.com/tcort/markdown-link-check/, an issue has already been raised (https://github.com/tcort/markdown-link-check/issues/172). However, that issue is now almost three years old without a response, and it's better for each individual client to send its own unique User-Agent to be a good netizen anyway. I therefore recommend that MegaLinter provide a versioned User-Agent in its default configurations.

Describe the solution you'd like

Add the following to the default https://github.com/oxsecurity/megalinter/blob/main/TEMPLATES/.markdown-link-check.json configuration, and the equivalent to the configuration of any other link-checking linters that may exist:

{
  // ... existing config ...

  "httpHeaders": [
    {
      "urls": ["http", ".", "/"],
      "headers": {
        "User-Agent": "Mozilla/5.0 (compatible; markdown-link-check/3.11.2; MegaLinter/7.8.0; +https://megalinter.io)"
      }
    }
  ]
}

For more information on User-Agent header best practices and why I recommend the above: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent
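As a rough sketch of how the versioned string and the config entry could be generated rather than hard-coded, assuming the format shown above: the `versioned_user_agent` and `with_user_agent` helpers are hypothetical names, not part of MegaLinter or markdown-link-check, and how MegaLinter would actually obtain the version numbers is left open.

```python
import copy


def versioned_user_agent(checker_version: str, megalinter_version: str) -> str:
    """Compose the User-Agent string in the format proposed in this issue."""
    return (
        "Mozilla/5.0 (compatible; "
        f"markdown-link-check/{checker_version}; "
        f"MegaLinter/{megalinter_version}; +https://megalinter.io)"
    )


def with_user_agent(config: dict, user_agent: str) -> dict:
    """Return a copy of a markdown-link-check config with the header entry appended."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    updated.setdefault("httpHeaders", []).append(
        {
            # Same URL patterns as the snippet proposed in this issue.
            "urls": ["http", ".", "/"],
            "headers": {"User-Agent": user_agent},
        }
    )
    return updated
```

For example, `with_user_agent({"retryOn429": True}, versioned_user_agent("3.11.2", "7.8.0"))` keeps the existing `retryOn429` key and adds an `httpHeaders` entry like the one proposed above.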

Describe alternatives you've considered

  • Add places like stackoverflow.com to my ignore list.
  • Cry

Additional context

My MegaLinter right now:

Please?

andrewvaughan, Jan 21 '24 20:01