
Possibly add end-to-end testing on a real site as part of build process

Open stevezieglerva opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe. This is not a problem with the script as is, but more a matter of building developers' confidence that nothing major was broken. This can be important for a complex process like link checking, which can easily have unseen issues given its multi-threaded, high-volume nature.

Describe the solution you'd like Execute a script as part of the test process (somehow in GitHub Actions?) that runs Hydra against one or more known test websites and then compares the Hydra output to a text file of expected results (at least compare the front matter). This could also be used to time performance to ensure changes do not make it slower. It could also be used to check configuration settings, outside of unit tests, like testing whether performance is slower when using only 1 thread.

I created sites like this when trying to build my own link checker: https://github.com/stevezieglerva/lnkchk_test_sites

It could be done in a shell script to avoid trying to shoehorn into a unit test framework.
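As a rough illustration of the comparison step, here is a minimal sketch in Python (a shell script using `diff` would work just as well). It assumes the "front matter" is everything before the first blank line of the report, which is a guess about the output layout, not something Hydra documents; `front_matter` and `diff_front_matter` are hypothetical helper names.

```python
import difflib


def front_matter(report):
    """Return the lines before the first blank line of a report.

    Assumption: the summary section ends at the first blank line.
    """
    lines = []
    for line in report.strip().splitlines():
        if not line.strip():
            break
        lines.append(line.rstrip())
    return lines


def diff_front_matter(expected, actual):
    """Return unified-diff lines; an empty list means the front matter matches."""
    return list(
        difflib.unified_diff(
            front_matter(expected),
            front_matter(actual),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )
```

A build step could read the expected-results file and the captured Hydra output, call `diff_front_matter`, and fail the job if the returned list is non-empty.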

Describe alternatives you've considered I don't have alternatives. I realize that this introduces some external dependencies into the build process which may not be desirable.

Additional context

stevezieglerva avatar Sep 13 '20 13:09 stevezieglerva

Hi Steve!

The current tests (run by GitHub Actions on PRs) check Hydra against a local HTML file to ensure it's behaving as expected. I considered using a live website (and even creating a small app that returned the requested HTTP response codes, but I digress). In the end I've stuck with the local page as I think it avoids the potential problem of accidentally testing your Internet connection instead of Hydra.
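For reference, the "small app that returned the requested HTTP response codes" idea can be sketched with the standard library alone, which also keeps everything on localhost. This is only an illustration, not code from the Hydra repository: the `/status/<code>` URL scheme and the names `StatusHandler`, `start_status_server`, and `fetch_status` are assumptions made up for this example.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Answer GET /status/<code> with that status code; anything else gets 200."""

    def do_GET(self):
        parts = self.path.strip("/").split("/")
        code = 200
        if len(parts) == 2 and parts[0] == "status" and parts[1].isdigit():
            requested = int(parts[1])
            if 200 <= requested <= 599:  # ignore codes outside the usual range
                code = requested
        body = f"status {code}\n".encode()
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_status_server(port=0):
    """Start the server on localhost in a daemon thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(url):
    """Return the HTTP status code for url, bypassing any configured proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

A test page could then link to `http://127.0.0.1:<port>/status/404`, `/status/500`, and so on, giving the checker known-bad links without touching the Internet.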

Given the nature of multithreading, as you mention, this doesn't address the issue of helping users to tune and configure Hydra to their use case. Perhaps it's a matter of presentation -- rather than providing a live site for "testing Hydra," providing a live site with expected return values could help users to tune and configure Hydra. As you said, this can be helpful in deciding how many threads to use or what length of timeouts to set in order to account for things like Internet speed, site rate limiting, and desired runtime.
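A tuning run of that kind could be as simple as timing the same crawl under different settings. The sketch below times arbitrary commands and does not assume anything about Hydra's options; `time_command` and `compare_runs` are hypothetical names, and the command lines a user passes in (for example, one per config file) are their own to fill in.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd several times and return the fastest wall-clock time in seconds.

    The exit code is ignored on purpose: a link checker may exit non-zero
    simply because it found broken links.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        best = min(best, time.perf_counter() - start)
    return best


def compare_runs(commands, repeats=3):
    """Map each label to the best time of its command."""
    return {label: time_command(cmd, repeats) for label, cmd in commands.items()}


# Example with a stand-in command; replace with real checker invocations.
example = {"noop": [sys.executable, "-c", "pass"]}
```

Calling `compare_runs` with one entry per configuration (say, a single-thread setup and a many-thread setup) gives a quick side-by-side of runtimes against the same site.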

I'm certainly not opposed to the concept, but will need some convincing as to implementation.

victoriadrake avatar Sep 24 '20 00:09 victoriadrake

This makes sense. The script, tests, and CI with Actions are beautifully simple and self-contained right now. Future developers could use known test sites for their own local testing prior to PRs.

szieglerICF avatar Sep 24 '20 13:09 szieglerICF