Do not check links that contain wildcards in CSP rules

Open jiazengp opened this issue 2 years ago • 10 comments

jiazengp avatar Apr 30 '22 12:04 jiazengp

Hey @jiazengp, can you provide an example?

mre avatar Apr 30 '22 13:04 mre

As a workaround you could exclude these yourself for now by adding --exclude '\*'.
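
To illustrate what that pattern does: `--exclude` takes a regular expression, and `\*` is an escaped, literal asterisk, so any URL containing `*` gets skipped. A minimal Python sketch of the same filtering idea follows. The `is_excluded` helper and the sample URLs are hypothetical, and this is not lychee's actual implementation.

```python
import re

# Same pattern as in the workaround: a backslash-escaped, literal asterisk.
EXCLUDE = re.compile(r"\*")


def is_excluded(url: str) -> bool:
    """Return True if the URL contains a literal '*' anywhere."""
    return EXCLUDE.search(url) is not None


# Hypothetical sample input, not taken from a real lychee run.
urls = [
    "https://*.example.com",    # wildcard source, skipped
    "https://cdn.example.com",  # concrete URL, still checked
]
to_check = [u for u in urls if not is_excluded(u)]
print(to_check)
```

Only the concrete URL remains in `to_check`; the escape matters because an unescaped `*` is a quantifier in regex syntax rather than a literal character.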

mre avatar Apr 30 '22 13:04 mre

> Hey @jiazengp, can you provide an example?

For example

<meta http-equiv="Content-Security-Policy" content="default-src 'none'; base-uri 'none'; connect-src https:; font-src 'self' data: https://at.alicdn.com https://*.yuanshen.site https://*.google.com;
  | form-action 'self' https://support.qq.com; frame-src 'self'; img-src 'self' data: https: https://cdn.jsdelivr.net;
  | manifest-src 'self'; media-src 'self' https://cdn.jsdelivr.net; script-src 'unsafe-eval' 'unsafe-hashes' 'unsafe-inline' 'self' https://esm.run https://polyfill.io https://b.alicdn.com https://*.clarity.ms/ https://hm.baidu.com https://cdn.jsdelivr.net https://widget.qweather.net https://*.baidu.com https://widget.heweather.net https://cdn.heweather.com/ https://webapi.amap.com https://restapi.amap.com https://*.googletagmanager.com https://*.google-analytics.com;
  | style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net https://widget.heweather.net; worker-src 'self' blob:;">

jiazengp avatar Apr 30 '22 15:04 jiazengp

Yeah, that is indeed an issue. I think we should move the issue to the lychee repository, where we maintain the Rust code that lychee-action uses. In fact, we should clarify this with the team behind linkify, the plaintext link extractor we use. I'm not sure whether this should count as a valid link (but it might very well be).

mre avatar Apr 30 '22 16:04 mre

Some clarification: the asterisk character is valid and serves as a wildcard operator, which stands for any number of characters. So https://foo.example.com and https://bar.example.com would both match https://*.example.com. The question is whether URLs with wildcard operators should count as valid links according to linkify's definition. I've opened an issue to clarify whether this is something we should handle on our end or inside linkify: https://github.com/robinst/linkify/issues/37.
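
The matching rule described here can be sketched in Python with shell-style globbing. This is a simplification under stated assumptions: `matches_wildcard` is a hypothetical helper, and a real CSP implementation matches on host labels rather than on the whole string.

```python
from fnmatch import fnmatchcase


def matches_wildcard(pattern: str, url: str) -> bool:
    """Glob-style match where '*' stands for any run of characters."""
    return fnmatchcase(url, pattern)


pattern = "https://*.example.com"
for candidate in (
    "https://foo.example.com",
    "https://bar.example.com",
    "https://example.org",
):
    print(candidate, matches_wildcard(pattern, candidate))
```

The first two candidates match the pattern and the third does not. The pattern itself names a family of hosts, not one fetchable address, which is why a link checker has nothing meaningful to request for it.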

mre avatar May 01 '22 12:05 mre

* is a valid character for both host and path.

lebensterben avatar May 01 '22 13:05 lebensterben

Things like https://*.yuanshen.site should be wrapped in pre tags, since they are never intended to be URLs, only examples.

lebensterben avatar May 01 '22 13:05 lebensterben

The meta tag that OP posted is valid (according to https://wiki.mozilla.org/Security/CSP/Specification) and not wrapped in a pre tag, though. We could ignore meta tags, but I think we shouldn't.

mre avatar May 01 '22 18:05 mre

I also think so

jiazengp avatar May 08 '22 10:05 jiazengp

Work is underway in linkify to fix this issue: https://github.com/robinst/linkify/pull/43

mre avatar Jul 01 '22 10:07 mre

This is fixed now, thanks to the awesome work by @robinst in https://github.com/robinst/linkify/pull/43. I'm no longer getting any wildcard URLs in your example:

✗ [404] https://b.alicdn.com/ | Failed: Network error: Not Found
✗ [403] https://widget.heweather.net/ | Failed: Network error: Forbidden
✗ [403] https://at.alicdn.com/ | Failed: Network error: Forbidden
✗ [403] https://widget.qweather.net/ | Failed: Network error: Forbidden
✗ [ERR] https://cdn.heweather.com/ | Failed: Network error: dns error: no record found for Query
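
One way to picture the intended outcome is to split a policy's sources into concrete URLs (worth checking) and wildcard sources (not checkable). The Python sketch below is only an illustration: `split_csp_sources` and the sample policy are hypothetical, and this is not how linkify extracts links, since linkify works on plain text and knows nothing about CSP syntax.

```python
def split_csp_sources(policy: str) -> tuple[list[str], list[str]]:
    """Split the URL-like sources of a CSP string into (concrete, wildcard)."""
    concrete: list[str] = []
    wildcard: list[str] = []
    for directive in policy.split(";"):
        # The first token is the directive name; the rest are source expressions.
        for source in directive.split()[1:]:
            if "://" not in source:
                continue  # skip keywords ('self') and bare schemes (https:, data:)
            if "*" in source:
                wildcard.append(source)
            else:
                concrete.append(source)
    return concrete, wildcard


# Hypothetical policy, shortened for illustration.
policy = (
    "default-src 'none'; "
    "font-src 'self' data: https://fonts.example.com https://*.example.org"
)
concrete, wildcard = split_csp_sources(policy)
print(concrete)
print(wildcard)
```

Here `concrete` holds only `https://fonts.example.com`, while `https://*.example.org` lands in `wildcard` and would be left unchecked.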

@jiazengp, I'm not sure if this is still a problem for you, but if it is, the latest version should have the fix.

mre avatar Jul 08 '23 22:07 mre