`--accept` doesn't accept the same values as `--cache-exclude-status`
--cache-exclude-status <CACHE_EXCLUDE_STATUS> A list of status codes that will be ignored from the cache The following exclude range syntax is supported: [start]..[[=]end]|code. Some valid examples are: - 429 (excludes the 429 status code only) - 500.. (excludes any status code >= 500)
looks like this should work
$ lychee -- --config tools/lychee/config.toml --cache --cache-exclude-status '500..' .
warning: Git tree '/home/katexochen/edgelesssys/contrast' is dirty
error: invalid value '500..' for '--cache-exclude-status <CACHE_EXCLUDE_STATUS>': failed to parse accept range: no range pattern found
For more information, try '--help'.
error: Recipe `check-links` failed on line 385 with exit code 2
$ lychee -- --version
lychee 0.18.1
Thank for reporting this. You are right, there is a mismatch between the docs and the implementation. ~~The tests in lychee_lib::types::status_code::selector::StatusCodeSelector do not contain the expressions mentioned in the docs and the implementation is missing them as well.~~ (actually does in lychee_lib::types::accept::range::AcceptRange)
This seems to affect the following forms: 100.., ..100, ..=100
For now you can use `500..=999` as a workaround to represent `500..`
..100
This actually seem to work as intended.
Huh interesting. I've tested it with the --accept flag then. This flag seems to behave differently than --cache-exclude-status but I think it shouldn't. I've now had a closer look and found with git bisect that in e13f7c9fef32923982d8eaac7111b199b5ce2196 (see https://github.com/lycheeverse/lychee/pull/1668) your problem has been fixed. ~~Since your might be on Nix you are probably using v0.18.1 via nixpkgs which doesn't include the fix yet.~~ (Ah just realised that you updated lychee in nixpkgs, thanks a lot 👍)
So in summary, updating to lychee v0.19.0 will fix this issue. However, --accept still doesn't work as advertised and behaves behaves differently compared to --cache-exclude-status. @HadrienG2 Do you know why this is the case?
Hmm, that's interesting. What I can quickly tell is that...
- #1668 modified
AcceptRange. --acceptmaps into aStatusCodeSelector, whereas--cache-exclude-statusmaps into aStatusCodeExcluderStatusCodeSelectorwraps aVec<AcceptRange>and itsFromStrimpl simply expects a non-empty comma-delimited list ofAcceptRange, which it parses and hands over toSelf::new_from()StatusCodeExcluderalso wraps aVec<AcceptRange>, and itsFromStrimpl is mostly the same except for accepting empty lists. Again, the output goes toSelf::new_from().
So far, this suggests that there is something special about StatusCodeSelector::new_from(). But to my eye they look identical. So I'm not 100% sure what's going on here...
Ok, I got it. The actual error is on current master is...
$ cargo run -- --accept=300.. https://github.com/lycheeverse/lychee/issues/1769#issuecomment-3084022958
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.18s
Running `target/debug/lychee --accept=300.. 'https://github.com/lycheeverse/lychee/issues/1769#issuecomment-3084022958'`
Error: invalid status code
Stack backtrace:
0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
at /home/hadrien/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/anyhow-1.0.98/src/backtrace.rs:27:14
1: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
at /home/hadrien/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:2050:27
2: lychee::client::create
at ./lychee-bin/src/client.rs:20:41
...where the matching source code line is...
let accepted: HashSet<StatusCode> = cfg.accept.clone().try_into()?;
As it turns out, the logic for converting AcceptRange to HashSet is not a pure copy-paste between StatusCodeSelector and StatusCodeExcluder, as is the case for most code in these structs (this begs for some refactor someday...). For StatusCodeSelector, that logic is:
impl<S: BuildHasher + Default> TryFrom<StatusCodeSelector> for HashSet<StatusCode, S> {
type Error = http::status::InvalidStatusCode;
fn try_from(value: StatusCodeSelector) -> Result<Self, Self::Error> {
<HashSet<u16>>::from(value)
.into_iter()
.map(StatusCode::from_u16)
.collect()
}
}
...and this looping logic will eventually encounter an u16 value that is not a valid HTTP status code while iterating over the unbounded AcceptRange.
I can think of a few ways to fix this:
- Make
AcceptRangeonly produce valid HTTP status codes. Those are in the 100..1000 range, whereas right now an unboundedAcceptRangestarts at0if no left bound is specified, and goes up tou16::MAXif no right bound is specified. - Make the conversion from
StatusCodeSelectortoHashSet<StatusCode>exclude theu16produced by the conversion toHashSet<u16>that do not belong to the valid HTTP status code range. - Don't use a
HashSet<StatusCode>internally, juste manipulate the ranges directly. For a typical small amount of ranges, this should actually be faster (two integer comparisons per range vs a full hash map lookup).
@HadrienG2 Thanks for the analysis. I might try to fix it if I find the time 👍
Make AcceptRange only produce valid HTTP status codes.
My vote goes to this. If we make invalid state unrepresentable, we can avoid it propagating across the codebase. We tackle the issue at the root.