crawl4ai icon indicating copy to clipboard operation
crawl4ai copied to clipboard

Fix: request `/crawl` with `stream: true` issue #1066

Open zinodynn opened this issue 7 months ago β€’ 0 comments

Summary

Fixes #1066 This PR fixes the error that happens when you send a request to /crawl with stream: true in the crawler_config. What’s the fix?

If a request has stream: true, it now automatically redirects (307) to the /crawl/stream endpoint.

No more async_generator errors

List of files changed and why

  1. deploy/docker/server.py
  • Added a check for stream: true in the /crawl request handler.
  • If detected, redirects to /crawl/stream instead of trying to process it in the wrong place.
  1. deploy/docker/static/playground/index.html
  • update stream configuration based on endpoint selection (/crawl :stream is False, /crawl/stream : stream is True ) image
  1. tests/docker/test_server_requests.py
  • add test for crawl endpoint with stream redirects

How Has This Been Tested?

Tested with curl(like the issue example) eg.

curl --location 'http://localhost:11235/crawl' \
--header 'Content-Type: application/json' \
--data '{
    "urls": [
        "https://example.com/page1",  
        "https://example.com/page2",
    ],
    "crawler_config": {
        "type": "CrawlerRunConfig",
        "params": {
            "scraping_strategy": {
                "type": "WebScrapingStrategy",
            },
            "stream": true
        }
    }
}'

Got a 307 redirect to /crawl/stream, and the streaming worked!

Checklist:

  • [x] My code follows the style guidelines of this project
  • [ ] I have performed a self-review of my own code
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [x] I have added/updated unit tests that prove my fix is effective or that my feature works
  • [x] New and existing unit tests pass locally with my changes

zinodynn avatar May 05 '25 10:05 zinodynn