Partial link exclusion: check status code but skip fragments for certain sites

Open TrebledJ opened this issue 4 months ago • 5 comments

In some cases it is helpful to verify that a site returns the expected status code but skip fragment checking. This is useful for sites that resolve the fragment client-side through JS.

One example is GitHub's line-number links, such as https://github.com/lycheeverse/lychee/blob/73dff8f56fae16b5ac47e3e2eac1bbb15cd52332/.gitignore#L1. While it may be desirable to verify that the fragment exists by checking whether the line number is available, such behaviour depends on the site and would have to be tediously implemented in lychee on a case-by-case basis. Instead, as a workaround, the user could specify domains/URLs for which fragment checking is skipped. Hence this feature request.

Perhaps a special exclusion syntax could be added?

TrebledJ avatar Aug 10 '25 14:08 TrebledJ

Interesting. If we end up implementing it, we need a clean syntax.

The workaround is to run lychee twice: once for inputs where fragments should be checked and once for inputs where they shouldn't.

mre avatar Aug 10 '25 22:08 mre

I faced the same need, but for GitHub comments and README headings (.../README.md#heading or .../issues/1789#issuecomment-xxx).

StrikerRUS avatar Sep 09 '25 14:09 StrikerRUS

I haven't tested this, but another workaround might be to remap away the fragments in the links where you don't want to check them. This would look something like:

--remap '(https://github\.com/lycheeverse/lychee/blob/[^#]+)#L\d+$ $1'

If you're already using remaps, be aware that at most one remap can apply per URL.
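
To get a feel for what that pattern would rewrite, here is a small Python sketch. It is only an approximation: it uses Python's `re` module rather than lychee's own regex engine, so the replacement is spelled `\1` instead of `$1`, and the helper name `strip_line_fragment` is made up for illustration.

```python
import re

# Same pattern as in the --remap suggestion above.
# Assumption: Python's `re` stands in for lychee's regex engine here.
PATTERN = re.compile(r"(https://github\.com/lycheeverse/lychee/blob/[^#]+)#L\d+$")


def strip_line_fragment(url: str) -> str:
    """Drop a trailing #L<number> fragment from lychee blob URLs.

    URLs that do not match the pattern are returned unchanged.
    """
    # \1 is Python's spelling of the $1 replacement used in the remap.
    return PATTERN.sub(r"\1", url)


if __name__ == "__main__":
    for link in (
        "https://github.com/lycheeverse/lychee/blob/73dff8f56fae16b5ac47e3e2eac1bbb15cd52332/.gitignore#L1",
        "https://github.com/lycheeverse/lychee/blob/master/README.md#usage",
    ):
        print(strip_line_fragment(link))
```

In this sketch only single-line fragments such as `#L1` are stripped; a heading fragment like `#usage` or a line range like `#L1-L5` does not match and is left as it is.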

katrinafyi avatar Sep 09 '25 17:09 katrinafyi

@katrinafyi Thanks a lot for the neat workaround! I used

remap = [
    '(?P<host>^https://github\.com)/(?P<path>.*)#(?P<anchor>.*)$ $host/$path/',
]

for my purposes.
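
For anyone wondering what this broader pattern does, a rough Python sketch follows. Again this is an approximation using Python's `re` module, where named groups are referenced as `\g<host>` rather than `$host`; the helper name `drop_github_fragment` and the `example/repo` URLs are hypothetical.

```python
import re

# Same pattern as in the remap entry above.
# Assumption: Python's `re` stands in for lychee's regex engine here.
PATTERN = re.compile(r"(?P<host>^https://github\.com)/(?P<path>.*)#(?P<anchor>.*)$")


def drop_github_fragment(url: str) -> str:
    """Rewrite any github.com URL that has a fragment to host/path/.

    URLs without a fragment, or on other hosts, are returned unchanged.
    """
    # \g<host>/\g<path>/ mirrors the "$host/$path/" replacement in the remap.
    return PATTERN.sub(r"\g<host>/\g<path>/", url)


if __name__ == "__main__":
    for link in (
        "https://github.com/example/repo/issues/1#issuecomment-xxx",
        "https://github.com/example/repo/blob/main/README.md#heading",
        "https://github.com/example/repo",
    ):
        print(drop_github_fragment(link))
```

Two things stand out in the sketch: the pattern drops every fragment on github.com rather than just line numbers, and the replacement appends a trailing slash to the rewritten URL.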

StrikerRUS avatar Sep 11 '25 00:09 StrikerRUS

Remaps are really powerful and a bit weird. 😆

mre avatar Sep 11 '25 09:09 mre