fink
Add option to dump response content to a folder
I don't want to use fink only to check for bad response codes. I also want to dump the response content (in my case HTML) to files, because I then want to validate those files with the w3c html5validator.
Before investigating how to implement this, I would like to check whether you are open to adding this option.
Maybe. My main concern would be that while dumping the HTML is easy enough, it takes you half-way to creating an offline version of a given site (dumping assets etc., then relativising URLs), and that's a new problem.
I think Fink could also be used as a library, and you could e.g. do:
```php
$dispatcher = DispatcherBuilder::create('http://www.example.com')
    ->callback(function (Response $response) {
        // your stuff here
    })
    ->build();
```
but you probably want to use this via the command, so that would require more refactoring (otherwise you would currently have to bootstrap the event loop stuff yourself).
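For the dump itself, the body of such a callback could be quite small. A sketch of that piece (note: how you obtain the URL and body from whatever object Fink hands you is an assumption here, not a confirmed Fink API — only the file-writing logic is plain PHP):

```php
<?php
// Sketch: write one response body to $outDir, deriving a filesystem-safe
// file name from the URL. Assumes you already have $url and $body as
// strings, however Fink exposes them.
function dumpHtml(string $outDir, string $url, string $body): string
{
    if (!is_dir($outDir)) {
        mkdir($outDir, 0777, true);
    }

    // Collapse anything outside [A-Za-z0-9._-] into underscores so the
    // URL becomes a safe file name.
    $name = preg_replace('/[^A-Za-z0-9._-]+/', '_', $url);
    $path = $outDir . '/' . $name . '.html';

    file_put_contents($path, $body);

    return $path;
}
```

The resulting files can then be fed straight into html5validator, which accepts a directory of `.html` files.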
Note that there is also https://github.com/spatie/crawler, which might be more suitable for your use case? Looking at the validator library, though, I guess it makes sense to have a tool which just dumps the HTML.
Not necessarily against the idea, and it would be convenient, but I'm actually not 100% sure it belongs here (it could fit, though).
If we did, I guess the `Crawler` should be refactored to extract the DOM parsing into an observer; the code to dump the HTML can then also be an observer, and the `Crawler` will only send notifications when it gets a `Response` and has read the `$body`.
Throwing an event sounds like a good idea to me and would make this very flexible. Would you use symfony/event-dispatcher for this, or which library do you prefer?
No, no libraries for this :) We can simply create an interface for the observer (e.g. `CrawlerObserver`) and pass a collection of these to the crawler (e.g. `CrawlerObservers`).
I think this will require some refactoring; still thinking about how to do it...
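A minimal sketch of the shape being discussed, just to make the idea concrete. None of this exists in Fink yet: the names `CrawlerObserver` and `CrawlerObservers` follow the suggestion above, and the `Response` class here is a placeholder standing in for whatever the crawler would actually pass (its `getUri()` accessor is an assumption):

```php
<?php
// Placeholder standing in for Fink's real Response object.
final class Response
{
    public function __construct(private string $uri)
    {
    }

    public function getUri(): string
    {
        return $this->uri;
    }
}

// The suggested observer contract: notified once the crawler has a
// Response and has read its body.
interface CrawlerObserver
{
    public function onResponse(Response $response, string $body): void;
}

// The suggested collection passed to the crawler; fans one notification
// out to every registered observer.
final class CrawlerObservers
{
    /** @var CrawlerObserver[] */
    private array $observers;

    public function __construct(CrawlerObserver ...$observers)
    {
        $this->observers = $observers;
    }

    public function notify(Response $response, string $body): void
    {
        foreach ($this->observers as $observer) {
            $observer->onResponse($response, $body);
        }
    }
}

// The HTML dump then becomes just one observer among others (the DOM
// parsing would be another).
final class HtmlDumpObserver implements CrawlerObserver
{
    public function __construct(private string $outDir)
    {
    }

    public function onResponse(Response $response, string $body): void
    {
        $name = preg_replace('/[^A-Za-z0-9._-]+/', '_', $response->getUri());
        file_put_contents($this->outDir . '/' . $name . '.html', $body);
    }
}
```

With this shape the crawler itself stays unaware of what the observers do, which keeps the "dump to folder" feature out of the core crawl loop.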