
[Bug]: mean_delay does not work with CrawlerRunConfig

Open · nguyenthengocdev opened this issue 1 month ago · 1 comment

crawl4ai version

0.7.6

Expected Behavior

When I set mean_delay, there should be a delay between requests

Current Behavior

The mean_delay config is ignored

Is this reproducible?

Yes

Inputs Causing the Bug


Steps to Reproduce


Code snippets

await crawler.arun_many(
    urls=[
        "https://docs.crawl4ai.com/core/examples/",
        "https://docs.crawl4ai.com/core/quickstart/",
    ],
    config=CrawlerRunConfig(
        stream=False, mean_delay=10.0, max_range=5.0, semaphore_count=1
    ),
)

OS

macOS

Python version

3.13

Browser

Chrome

Browser version

No response

Error logs & Screenshots (if applicable)

Config:

config=CrawlerRunConfig(
    stream=False, mean_delay=10.0, max_range=5.0, semaphore_count=1
)

Result:

[INIT].... → Crawl4AI 0.7.6
[FETCH]... ↓ https://docs.crawl4ai.com/core/examples/   | ✓ | ⏱: 3.63s
[SCRAPE].. ◆ https://docs.crawl4ai.com/core/examples/   | ✓ | ⏱: 0.03s
[COMPLETE] ● https://docs.crawl4ai.com/core/examples/   | ✓ | ⏱: 3.66s
[FETCH]... ↓ https://docs.crawl4ai.com/core/quickstart/ | ✓ | ⏱: 1.63s
[SCRAPE].. ◆ https://docs.crawl4ai.com/core/quickstart/ | ✓ | ⏱: 0.02s
[COMPLETE] ● https://docs.crawl4ai.com/core/quickstart/ | ✓ | ⏱: 1.65s
Elapsed time: 3.70 seconds
https://docs.crawl4ai.com/core/examples/ crawled OK!
https://docs.crawl4ai.com/core/quickstart/ crawled OK!

nguyenthengocdev · Nov 04 '25

In arun_many:

        config = config or CrawlerRunConfig()
        # if config is None:
        #     config = CrawlerRunConfig(
        #         word_count_threshold=word_count_threshold,
        #         extraction_strategy=extraction_strategy,
        #         chunking_strategy=chunking_strategy,
        #         content_filter=content_filter,
        #         cache_mode=cache_mode,
        #         bypass_cache=bypass_cache,
        #         css_selector=css_selector,
        #         screenshot=screenshot,
        #         pdf=pdf,
        #         verbose=verbose,
        #         **kwargs,
        #     )

        if dispatcher is None:
            dispatcher = MemoryAdaptiveDispatcher(
                rate_limiter=RateLimiter(
                    base_delay=(1.0, 3.0), max_delay=60.0, max_retries=3
                ),
            )

It looks like rate limiting has been moved to the dispatcher parameter. If you're calling arun_many directly, try creating the dispatcher yourself and passing it in with the delay you want:

# SemaphoreDispatcher and RateLimiter live in crawl4ai.async_dispatcher
# (depending on your version they may also be re-exported from the top-level package)
from crawl4ai import CrawlerRunConfig
from crawl4ai.async_dispatcher import SemaphoreDispatcher, RateLimiter

# One concurrent session, with a randomized 8-12s delay between requests
dispatcher = SemaphoreDispatcher(
    max_session_permit=1,
    rate_limiter=RateLimiter(
        base_delay=(8.0, 12.0), max_delay=60.0, max_retries=3
    ),
)

await crawler.arun_many(
    urls=[
        "https://docs.crawl4ai.com/core/examples/",
        "https://docs.crawl4ai.com/core/quickstart/",
    ],
    config=CrawlerRunConfig(stream=False),
    dispatcher=dispatcher,
)

I do think it's a bit misleading that those config options are still available, though. They're presumably kept for backwards compatibility, but they should either be removed in a future release or unpacked to dynamically set the corresponding dispatcher values, to prevent this kind of confusion.
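
For reference, a rough sketch of what that unpacking could look like inside arun_many when no dispatcher is passed. The attribute names (mean_delay, max_range, semaphore_count) come from the existing CrawlerRunConfig; the defaults and the mapping onto RateLimiter's base_delay range are just my guess, not how the library currently behaves:

if dispatcher is None:
    # Hypothetical fallback: build the default dispatcher from the legacy
    # CrawlerRunConfig delay/concurrency fields instead of ignoring them.
    mean_delay = getattr(config, "mean_delay", None) or 1.0
    max_range = getattr(config, "max_range", None) or 2.0
    semaphore_count = getattr(config, "semaphore_count", None) or 5
    dispatcher = SemaphoreDispatcher(
        max_session_permit=semaphore_count,
        rate_limiter=RateLimiter(
            # delay each request somewhere between mean_delay and
            # mean_delay + max_range seconds (assumed semantics)
            base_delay=(mean_delay, mean_delay + max_range),
            max_delay=60.0,
            max_retries=3,
        ),
    )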

jtanningbed · Nov 26 '25