
[Feat] Be able to pass a timeout param to the endpoints

Open · nickscamara opened this issue 10 months ago • 9 comments

Enable the user to pass a timeout parameter to both the scrape and the crawl endpoints. If the timeout is exceeded, send the user a clear error message. On the crawl endpoint, return any pages that have already been scraped, with a message notifying the user that the timeout was exceeded.
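For illustration, the requested behavior might look like this from a client's point of view (a hypothetical TypeScript sketch; the endpoint path, parameter name, and 408 status are assumptions, not a committed design):

```typescript
// Hypothetical client call; the "timeout" field (in ms) is the requested param.
const res = await fetch("https://api.firecrawl.dev/v0/scrape", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ url: "https://example.com", timeout: 30_000 }),
});

if (res.status === 408) {
  // The clear error message requested in this issue.
  console.error((await res.json()).error);
}
```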

If the task is completed within two days, we'll include a $10 tip :)

This is an intro bounty. We are looking for excited people who will buy in so we can start to ramp up.

nickscamara avatar Apr 24 '24 22:04 nickscamara

@nickscamara Can I get assigned?

ezhil56x avatar Apr 25 '24 06:04 ezhil56x

@ezhil56x all yours!

nickscamara avatar Apr 25 '24 06:04 nickscamara

@nickscamara Do we need a default timeout, or is it not required?

ezhil56x avatar Apr 25 '24 08:04 ezhil56x

Hi, is this issue still open, or is someone working on it?

parthusun8 avatar Jun 08 '24 20:06 parthusun8

@parthusun8, the issue is still open, but fixing it would require some fairly complex changes to our Bull queue system to allow the /crawl route to time out. So far, we've found that stopping an active job in Bull isn't possible, which means we'd have to change the deepest parts of our system to add a timeout feature to Firecrawl.
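A common workaround for that limitation is cooperative cancellation: since Bull can't stop a running processor from the outside, the processor itself checks a deadline between units of work. A minimal sketch of that idea (illustrative, not Firecrawl's actual code; the job-data shape is assumed):

```typescript
import Queue from "bull";

// Illustrative cooperative-timeout sketch: the processor checks a deadline
// between pages and exits early with partial results.
const crawlQueue = new Queue("crawl", "redis://127.0.0.1:6379");

crawlQueue.process(async (job) => {
  const { urls, deadline } = job.data as { urls: string[]; deadline: number };
  const scraped: string[] = [];

  for (const url of urls) {
    if (Date.now() >= deadline) {
      // Deadline exceeded: return whatever was scraped so far.
      return { scraped, timedOut: true };
    }
    scraped.push(url); // stand-in for the actual scrape of `url`
  }
  return { scraped, timedOut: false };
});

// Enqueue a crawl whose deadline is 60 seconds from now (illustrative).
crawlQueue.add({
  urls: ["https://example.com"],
  deadline: Date.now() + 60_000,
});
```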

rafaelsideguide avatar Jun 10 '24 19:06 rafaelsideguide

@nickscamara should we close this for now?

rafaelsideguide avatar Jul 01 '24 12:07 rafaelsideguide

Can I be assigned to this work?

haija45 avatar Jul 05 '24 16:07 haija45

/attempt #59

My implementation plan 👍

In the scrape endpoint, we use the scrapeUrl function and pass the timeout value as an option. If the scrape operation times out, we catch the TimeoutError and return a JSON response with a status code of 408 (Request Timeout).
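A minimal TypeScript sketch of that plan for the scrape endpoint, assuming an Express-style handler; `scrapeUrl`, `TimeoutError`, the default timeout, and the route path are illustrative stand-ins rather than Firecrawl's actual internals:

```typescript
import express, { Request, Response } from "express";

class TimeoutError extends Error {}

// Race a promise against a timer; reject with TimeoutError on expiry.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new TimeoutError(`exceeded ${ms}ms`)), ms)
    ),
  ]);
}

// Stand-in for the real scraper.
async function scrapeUrl(url: string): Promise<{ content: string }> {
  const resp = await fetch(url);
  return { content: await resp.text() };
}

const app = express();
app.use(express.json());

app.post("/v0/scrape", async (req: Request, res: Response) => {
  const timeout = Number(req.body.timeout) || 30_000; // assumed 30s default
  try {
    const page = await withTimeout(scrapeUrl(req.body.url), timeout);
    res.json({ success: true, data: page });
  } catch (err) {
    if (err instanceof TimeoutError) {
      // 408 Request Timeout with a clear message, as planned above.
      res.status(408).json({
        success: false,
        error: `Scrape timed out after ${timeout}ms`,
      });
    } else {
      res.status(500).json({ success: false, error: String(err) });
    }
  }
});

app.listen(3000);
```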

In the crawl endpoint, we use the crawlUrl function and pass the timeout value as an option. If the crawl operation times out, we catch the TimeoutError and return a JSON response with a status code of 408 (Request Timeout). We also add a message to each page in the response indicating that the crawl timed out.
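And a sketch of the crawl side, returning the already-scraped pages with a timeout message on each; again, the names and result shape are assumptions. Note the deadline is only checked between page fetches, which reflects the cooperative limitation discussed earlier in this thread:

```typescript
interface CrawledPage {
  url: string;
  content: string;
  message?: string;
}

// Illustrative breadth-first crawl with a deadline; not Firecrawl internals.
async function crawlWithTimeout(
  startUrl: string,
  timeoutMs: number,
  fetchPage: (url: string) => Promise<{ content: string; links: string[] }>
): Promise<{ pages: CrawledPage[]; timedOut: boolean }> {
  const deadline = Date.now() + timeoutMs;
  const queue = [startUrl];
  const seen = new Set(queue);
  const pages: CrawledPage[] = [];

  while (queue.length > 0) {
    if (Date.now() >= deadline) {
      // Tag every already-scraped page so the caller sees the timeout.
      for (const p of pages) p.message = "Crawl timed out; partial results";
      return { pages, timedOut: true };
    }
    const url = queue.shift()!;
    const { content, links } = await fetchPage(url);
    pages.push({ url, content });
    for (const link of links) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return { pages, timedOut: false };
}
```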

akay41024 avatar Jul 31 '24 21:07 akay41024

@akay41024: Another person is already attempting this issue. Please don't start working on this issue unless you were explicitly asked to do so.

algora-pbc[bot] avatar Jul 31 '24 21:07 algora-pbc[bot]