
[Bug]: Injected JavaScript Not Executed Properly by Crawler

Open Harinib-Kore opened this issue 6 months ago • 0 comments

crawl4ai version

0.6.2

Expected Behavior

The injected JS code should run and accept the cookies on the page.

Current Behavior

  1. The crawler fails to execute custom JavaScript injected via js_code, which is meant to interact with elements on the page (e.g., accepting cookies). Even with valid JS injected, the behavior is not as expected: the button is not clicked and the cookie prompt remains. Are there other settings I can use to accept cookies and crawl these pages? I am facing this issue when crawling pages like this.

  2. We are still encountering duplicate URLs during deep crawling. Additionally, even when an explicit content-type filter for text/html is applied, PDF files are still crawled.
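
Until the second point is resolved, a possible stopgap is to post-filter the crawl results on the caller's side. The sketch below is a minimal, standard-library-only illustration and does not use any crawl4ai API: the helper names (`normalize_url`, `looks_like_pdf`, `dedupe_html_pages`) and the normalization rules (lowercase host, no fragment, no trailing slash) are assumptions chosen for this example, and they may not match how the deep crawler itself identifies duplicates.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return a canonical form of the URL, used only for duplicate detection."""
    parts = urlsplit(url)
    # Treat "/a" and "/a/" as the same page; an empty path becomes "/".
    path = parts.path.rstrip("/") or "/"
    # Lowercase scheme and host, keep the query string, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def looks_like_pdf(url: str, content_type: str = "") -> bool:
    """Heuristic: a .pdf path suffix or a PDF content type marks a PDF."""
    path = urlsplit(url).path.lower()
    return path.endswith(".pdf") or "application/pdf" in content_type.lower()


def dedupe_html_pages(pages):
    """Take (url, content_type) pairs; return unique non-PDF URLs in order."""
    seen = set()
    kept = []
    for url, content_type in pages:
        if looks_like_pdf(url, content_type):
            continue
        key = normalize_url(url)
        if key in seen:
            continue
        seen.add(key)
        kept.append(url)
    return kept
```

This only hides the symptoms after the fact: the duplicate pages and PDFs would still be fetched, so it does not replace a fix in the crawler's own filter chain.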

cc: @unclecode @aravindkarnam @ntohidi

Is this reproducible?

Yes

Inputs Causing the Bug


Steps to Reproduce


Code snippets

import asyncio
from crawl4ai import AsyncWebCrawler
from crawl4ai.async_configs import CrawlerRunConfig

async def main():
    js_code = """
    setTimeout(() => {
        const acceptButton = document.querySelector('a.wscrOk2');
        if(acceptButton) {
            console.log('Found accept button - clicking...');
            acceptButton.click();
            setTimeout(() => {
                console.log('Cookies should be accepted now');
            }, 1000);
        } else {
            console.log('Accept button not found');
        }
    }, 2000);
    """

    config = CrawlerRunConfig(
        js_code=js_code,
        scan_full_page=True,
        check_robots_txt=False,
        verbose=True,
    )

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://benefits.workday.com/uk/eap",
            config=config
        )
        print(result.markdown)

if __name__ == "__main__":
    asyncio.run(main())

Output:

We place cookies on your device to enable this site to work, to enhance your user experience and to improve our services. Some cookies we use are necessary for the site to work, while others are used to help us manage and improve the site and the services we offer you. If you’re happy to opt-in to our use of cookies just click the "Accept all cookies" button.
[Necessary cookies only](https://benefits.workday.com/uk/eap)[Accept all cookies](https://benefits.workday.com/uk/eap)
[Review our use of cookies and set your preferences](https://benefits.workday.com/uk/eap)
Our website uses cookies to distinguish you from other users of our website. This helps us to provide you with a good experience when you browse our website and also allows us to improve our site. 
This Cookie Policy sets out the
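
One unconfirmed explanation for the output above is timing: the script only schedules the click with setTimeout, so the page HTML may be captured before the 2-second timer fires. The sketch below builds a variant that polls for the button and clicks it as soon as it appears, and pairs it with the delay_before_return_html option of CrawlerRunConfig so the click has time to happen. The helper name `build_cookie_accept_kwargs`, the 200 ms polling interval, and the timeout values are hypothetical choices for illustration. Whether the crawler awaits the async script in 0.6.2 has not been verified, which is why the extra delay is included.

```python
import json

# JavaScript template: poll for the button until a deadline, click it once found.
# __SELECTOR__ and __TIMEOUT_MS__ are placeholders filled in by the helper below.
_JS_TEMPLATE = """
(async () => {
    const deadline = Date.now() + __TIMEOUT_MS__;
    while (Date.now() < deadline) {
        const btn = document.querySelector(__SELECTOR__);
        if (btn) { btn.click(); return true; }
        await new Promise(resolve => setTimeout(resolve, 200));
    }
    return false;
})();
"""


def build_cookie_accept_kwargs(selector: str, timeout_ms: int = 5000) -> dict:
    """Build keyword arguments intended for CrawlerRunConfig (hypothetical helper)."""
    js_code = (
        _JS_TEMPLATE
        # json.dumps yields a safely quoted JavaScript string literal.
        .replace("__SELECTOR__", json.dumps(selector))
        .replace("__TIMEOUT_MS__", str(int(timeout_ms)))
    )
    return {
        "js_code": js_code,
        # Wait slightly longer than the polling deadline before reading the HTML.
        "delay_before_return_html": timeout_ms / 1000 + 1.0,
    }


kwargs = build_cookie_accept_kwargs("a.wscrOk2", timeout_ms=3000)
# Intended usage (assumption, not verified against 0.6.2):
#   config = CrawlerRunConfig(scan_full_page=True, verbose=True, **kwargs)
```

The fixed delay applies to every crawled page, so this trades crawl speed for a better chance that the banner is dismissed before the content is extracted.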

OS

linux

Python version

3.9.7

Browser

linux

Browser version

131.0.6778.139

Error logs & Screenshots (if applicable)

No response

Harinib-Kore, May 18 '25 18:05