
Allow passing a parser as a closure when a route is matched

Open Dragnucs opened this issue 3 years ago • 2 comments

Making this a draft for now until I add more documentation and testing.

This basically adds a method to execute a closure so that we can start parsing pages before the completion of the whole crawl.

Many websites are large enough that a crawl takes hours, while we need quick access to the crawled data. Also, if anything goes wrong partway through, all the effort spent is lost.
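To illustrate the idea, here is a minimal, self-contained sketch of running closures on each page as soon as it is fetched, rather than after the whole crawl. It is not the code in this PR and does not use the crate's real types: `Page`, `Crawler`, `on_page`, `fetch`, and `crawl_and_collect` are hypothetical names, and `fetch` fabricates a page instead of doing network I/O.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a fetched page (not the crate's actual type).
struct Page {
    url: String,
    html: String,
}

/// Toy crawler that keeps a list of callbacks to run on every page.
struct Crawler<'a> {
    queue: VecDeque<String>,
    callbacks: Vec<Box<dyn FnMut(&Page) + 'a>>,
}

impl<'a> Crawler<'a> {
    fn new(seeds: &[&str]) -> Self {
        Crawler {
            queue: seeds.iter().map(|s| s.to_string()).collect(),
            callbacks: Vec::new(),
        }
    }

    /// Register a closure; several can be registered and run in order.
    fn on_page(&mut self, callback: impl FnMut(&Page) + 'a) {
        self.callbacks.push(Box::new(callback));
    }

    /// Visit every queued URL, invoking the callbacks right after each fetch,
    /// so parsed data is available before the crawl as a whole finishes.
    fn crawl(&mut self) {
        while let Some(url) = self.queue.pop_front() {
            let page = fetch(&url);
            for callback in self.callbacks.iter_mut() {
                callback(&page);
            }
        }
    }
}

/// Assumption: a fake fetch that builds a page locally instead of using HTTP.
fn fetch(url: &str) -> Page {
    Page {
        url: url.to_string(),
        html: format!("<html>{}</html>", url),
    }
}

/// Crawl the seeds and return the visited URLs plus the total HTML size in bytes.
fn crawl_and_collect(seeds: &[&str]) -> (Vec<String>, usize) {
    let mut urls = Vec::new();
    let mut total_bytes = 0;
    {
        let mut crawler = Crawler::new(seeds);
        crawler.on_page(|page| urls.push(page.url.clone()));
        crawler.on_page(|page| total_bytes += page.html.len());
        crawler.crawl();
    } // the crawler (and its borrows of `urls` and `total_bytes`) ends here
    (urls, total_bytes)
}

fn main() {
    let (urls, total_bytes) = crawl_and_collect(&["https://example.com/a", "https://example.com/b"]);
    println!("visited {:?}, {} bytes of HTML", urls, total_bytes);
}
```

Because the callbacks fire inside the crawl loop, whatever they have stored or written out so far survives even if a later page fails.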

This PR depends on #9. A rebase might be needed after the former is merged.

Dragnucs avatar Apr 10 '21 13:04 Dragnucs

Ahh nice, I started working on this here https://github.com/madeindjs/spider/pull/13

j-mendez avatar Feb 11 '22 02:02 j-mendez

There is a slight difference. My PR adds a list of callbacks.

If you manage to write tests for it, that would be perfect.

Thanks.

Regards, Touhami https://touha.me

Dragnucs avatar Feb 11 '22 09:02 Dragnucs

@Dragnucs We are shifting the branch target to main which will close the PR. Feel free to put it up again on main, thanks!

j-mendez avatar Sep 24 '22 21:09 j-mendez