lychee-action

Reddit links can not be checked? + Default list of exclusions

Open tooomm opened this issue 6 months ago • 4 comments

I found an older discussion about problems with checking Reddit links and posted my observation there too: https://github.com/lycheeverse/lychee/discussions/1324#discussioncomment-13402967

After adding a link to a project-related subreddit to a README file, the check always returns an error for that link: [403] Network error: Forbidden

The link follows the usual pattern (https://www.reddit.com/r/SUBREDDITNAME/) and works just fine when opened directly in the browser.

Can somebody confirm this?


I had to add --exclude www.reddit.com to the args in lychee-action.

After reading https://github.com/lycheeverse/lychee-action/issues/53, I found out that Twitter links, for example, are always excluded as false positives because they would always fail: https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/filter/mod.rs#L34-L39

Should Reddit also be listed there?


I also saw that certain w3.org links are skipped automatically: https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/filter/mod.rs#L42-L52 However, pages such as https://www.w3schools.com/xml/schema_intro.asp list a different link that is not covered right now: http://www.w3.org/2001/XMLSchema-instance. I currently work around this by excluding www.w3.org in general.
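
To illustrate the gap described above, here is a minimal Python sketch of how a list of narrow exclusion patterns can miss a sibling URL on the same host. The patterns below are assumptions chosen for illustration, not a copy of the list in lychee's filter/mod.rs, and `is_excluded` is a hypothetical helper, not part of lychee.

```python
import re

# Hypothetical exclusion patterns, assumed for illustration only;
# they are NOT copied from lychee's filter/mod.rs.
ASSUMED_PATTERNS = [
    r"^https?://www\.w3\.org/1999/xhtml",
    r"^https?://www\.w3\.org/2000/svg",
]

# Broad pattern that mirrors the workaround of excluding www.w3.org entirely.
BROAD_PATTERN = r"^https?://www\.w3\.org/"


def is_excluded(url, patterns):
    """Return True if any regex in `patterns` matches `url`."""
    return any(re.search(p, url) for p in patterns)


url = "http://www.w3.org/2001/XMLSchema-instance"

# The narrow patterns do not cover this URL, so it would still be checked.
narrow = is_excluded(url, ASSUMED_PATTERNS)

# Adding the broad host-level pattern excludes it.
broad = is_excluded(url, ASSUMED_PATTERNS + [BROAD_PATTERN])

print(narrow, broad)
```

A host-level pattern like the broad one above is simple, but it also skips every other link on that host, which is the trade-off of the workaround.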

tooomm avatar Jun 15 '25 15:06 tooomm

It depends on the network connection. It works over here (at the moment):

echo 'https://www.reddit.com/r/rust/' | lychee -vvv -
     [200] https://www.reddit.com/r/rust/

🔍 1 Total (in 0s) ✅ 1 OK 🚫 0 Errors

Maybe they blocked the GitHub IP Address range?

mre avatar Jun 16 '25 12:06 mre

As for w3.org, could you create a PR? I think we should exclude that automatically.

mre avatar Jun 16 '25 12:06 mre

> It depends on the network connection. It works over here (at the moment):
>
> echo 'https://www.reddit.com/r/rust/' | lychee -vvv -
>      [200] https://www.reddit.com/r/rust/
>
> 🔍 1 Total (in 0s) ✅ 1 OK 🚫 0 Errors
>
> Maybe they blocked the GitHub IP Address range?

I did not try with lychee directly as I'm only using the lychee-action GitHub Action. That's also why I opened the ticket here and not at the lychee repo.

Can you reproduce my issue when trying the same URL with the action? Maybe they do indeed block GitHub IPs? 🤔


> As for w3.org, could you create a PR? I think we should exclude that automatically.

https://github.com/lycheeverse/lychee/pull/1735 😎

tooomm avatar Jun 16 '25 18:06 tooomm

Just tested using the lychee-action@master and can confirm:

[403] https://www.reddit.com/r/rust/ | Rejected status code (this depends on your "accept" configuration): Forbidden
# Summary
| Status        | Count |
|---------------|-------|
| 🔍 Total      | 1     |
| ✅ Successful | 0     |
| ⏳ Timeouts   | 0     |
| 🔀 Redirected | 0     |
| 👻 Excluded   | 0     |
| ❓ Unknown    | 0     |
| 🚫 Errors     | 1     |
## Errors per input
### Errors in README.md
* [403] <https://www.reddit.com/r/rust/> | Rejected status code (this depends on your "accept" configuration): Forbidden

Unfortunately, it looks like Reddit blocks GitHub (workflows) now. I also tried setting a different user-agent, but it didn't work.

Since it still works with the lychee binary, I see two options:

  • Add Reddit to a "global" list of exclusions, which gets used in lychee-action.
  • Do nothing and hope that GitHub will be unblocked.

What do you think?

mre avatar Jun 20 '25 13:06 mre

Any feedback?

mre avatar Aug 25 '25 10:08 mre

> Just tested using the lychee-action@master and can confirm:
>
> [403] https://www.reddit.com/r/rust/ | Rejected status code (this depends on your "accept" configuration): Forbidden
> # Summary
> | Status        | Count |
> |---------------|-------|
> | 🔍 Total      | 1     |
> | ✅ Successful | 0     |
> | ⏳ Timeouts   | 0     |
> | 🔀 Redirected | 0     |
> | 👻 Excluded   | 0     |
> | ❓ Unknown    | 0     |
> | 🚫 Errors     | 1     |
> ## Errors per input
> ### Errors in README.md
> * [403] <https://www.reddit.com/r/rust/> | Rejected status code (this depends on your "accept" configuration): Forbidden
>
> Unfortunately, it looks like Reddit blocks GitHub (workflows) now. I also tried setting a different user-agent, but it didn't work.
>
> Since it still works with the lychee binary, I see two options:
>
>   • Add Reddit to a "global" list of exclusions, which gets used in lychee-action.
>   • Do nothing and hope that GitHub will be unblocked.
>
> What do you think?

Thanks for answering the question I had a couple of hours ago. For most websites I get: Rejected status code (this depends on your "accept" configuration): Forbidden. So I will need to wait for them to allow requests from GitHub Actions, correct?

wmariuss avatar Sep 21 '25 21:09 wmariuss

Yes, that is the case. There's nothing we can do about it. Reddit would have to unblock GitHub IP ranges from making requests.

I don't know if or when this ban will be lifted.

mre avatar Sep 22 '25 09:09 mre

I don't think there is much left to do in this issue, so I'm closing it. In my opinion, we should not add Reddit to the global exclusion list, because requests still work from the command line.

mre avatar Sep 22 '25 09:09 mre