ci: add tests done via local webserver
To preserve the idea from here: https://github.com/lycheeverse/lychee/pull/1733#issuecomment-2984529983
lychee invokes different modules and code when checking local files, as file:// URLs, compared to remote files via HTTP. Currently this logic is split between these two source files:
- https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/checker/file.rs
- https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/checker/website.rs
In some cases, e.g. with fragment checking enabled, they behave with subtle differences. Hence it would be good to run more, if not all, URL checker tests additionally via the remote URL/website checker. That way we can ensure both behave as expected, and aim for consistency where needed.
lychee currently runs most tests, reasonably, with the local file checker. A small number of tests use remote URLs pointing to the lychee GitHub repository. While this works for the main repo, it can break for forks, e.g. when someone is implementing changes for the main project while the main project changes something among its test files. Likewise, changes to test files on a fork, or even just on a future branch, break tests which run against the hardcoded main branch of the main repo. Actual remote HTTP requests also cause additional traffic, take more time, and may fail due to local or remote network issues unrelated to lychee code or test files.
To allow adding more remote URL checker tests without (CI) regressions, it would be best to spin up and check against a locally running webserver, which serves the test files from the local Git repository via HTTP with the expected MIME type and headers.
There are very lightweight Rust file "server" crates, which however may lack certain headers, like the MIME type/Content-Type, while a full-blown dedicated Nginx or similar might be overkill. A good compromise, which seems easy to set up via tokio, might be warp, which is based on hyper. A particular check could be wrapped with an individual instance like this:

    let input = fixtures_path().join("fragments");

    // Spin up webserver to serve local files via HTTP
    let webserver = tokio::spawn(async {
        warp::serve(warp::fs::dir(input))
            .run(([127, 0, 0, 1], 1337))
            .await;
    });

    // URL checks done here ...

    // Stop webserver and wait for it to finish
    webserver.abort();
    let _ = webserver.await;

Generally, it would be better to start the server once before/for all tests, but I am currently not sure how best to achieve this. Maybe a function which checks whether the server is already running, and starts it only if not? Or is there some kind of test init feature?
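
As far as I know, Rust's built-in test harness has no global setup hook, so lazy initialization through a static is the usual way to "start only if not yet running". Below is a rough sketch of that idea, not a proposal for the final implementation. To keep it dependency-free it swaps warp for a tiny blocking server on std::net, which handles one connection at a time and does no percent-decoding. The names `shared_server`, `content_type` and `handle` are made up for illustration, and the MIME table is only a placeholder. It binds to port 0 so the OS picks a free port instead of a hardcoded one.

```rust
use std::fs;
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::path::Path;
use std::sync::OnceLock;
use std::thread;

/// Address of the shared test server, set by whichever test asks first.
static SERVER: OnceLock<SocketAddr> = OnceLock::new();

/// Hypothetical helper: starts the server on the first call only and
/// returns its address on every call. `root` is ignored after the first call.
fn shared_server(root: &Path) -> SocketAddr {
    *SERVER.get_or_init(|| {
        // Port 0 lets the OS choose a free port
        let listener = TcpListener::bind(("127.0.0.1", 0)).expect("bind failed");
        let addr = listener.local_addr().expect("no local address");
        let root = root.to_path_buf();
        // The thread is never joined; it ends when the test process exits
        thread::spawn(move || {
            for stream in listener.incoming() {
                if let Ok(stream) = stream {
                    let _ = handle(stream, &root);
                }
            }
        });
        addr
    })
}

/// Placeholder MIME mapping (assumption: only a few fixture types matter).
fn content_type(path: &Path) -> &'static str {
    match path.extension().and_then(|e| e.to_str()) {
        Some("html") | Some("htm") => "text/html; charset=utf-8",
        Some("md") => "text/markdown; charset=utf-8",
        Some("txt") => "text/plain; charset=utf-8",
        _ => "application/octet-stream",
    }
}

/// Answers a single GET or HEAD request, then closes the connection.
fn handle(mut stream: TcpStream, root: &Path) -> std::io::Result<()> {
    let mut reader = BufReader::new(stream.try_clone()?);
    let mut request_line = String::new();
    reader.read_line(&mut request_line)?;
    // Drain the request headers up to the empty line
    let mut line = String::new();
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 || line == "\r\n" {
            break;
        }
    }
    let mut parts = request_line.split_whitespace();
    let method = parts.next().unwrap_or("");
    let target = parts.next().unwrap_or("/");
    // Drop the query string and the leading slash; refuse path traversal
    let rel = target.split('?').next().unwrap_or("").trim_start_matches('/');
    let file = root.join(rel);
    let found = if rel.contains("..") { None } else { fs::read(&file).ok() };
    let (status, ctype, body) = match found {
        Some(body) => ("200 OK", content_type(&file), body),
        None => ("404 Not Found", "text/plain; charset=utf-8", b"not found".to_vec()),
    };
    write!(
        stream,
        "HTTP/1.1 {status}\r\nContent-Type: {ctype}\r\nContent-Length: {}\r\nConnection: close\r\n\r\n",
        body.len()
    )?;
    if method != "HEAD" {
        stream.write_all(&body)?;
    }
    Ok(())
}
```

Each test would call something like `let base = format!("http://{}", shared_server(&fixtures_path()));` and build its URLs from that. Note that every file under `tests/` is compiled into its own test binary, so the server would be started once per binary rather than once per `cargo test` run. The same `OnceLock` pattern should work with warp on a dedicated tokio runtime, but I have not tried that.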