
[Bug]: Error: BrowserContext.new_page: Target page, context or browser has been closed

Open fcoury opened this issue 7 months ago • 7 comments

crawl4ai version

0.6.2

Expected Behavior

crawl4ai-doctor completed

Current Behavior

fcoury@m3pro ~/c/crawltest on  master [!] via 🐍 v3.12.9 (crawltest)
> $ crawl4ai-setup
[INIT].... → Running post-installation setup...
[INIT].... → Installing Playwright browsers...
[COMPLETE] ● Playwright installation completed successfully.
[INIT].... → Starting database initialization...
[COMPLETE] ● Database initialization completed successfully.
[COMPLETE] ● Post-installation setup completed!

fcoury@m3pro ~/c/crawltest on  master [!] via 🐍 v3.12.9 (crawltest) took 6s
> $ crawl4ai-doctor
[INIT].... → Running Crawl4AI health check...
[INIT].... → Crawl4AI 0.6.2
[TEST].... ℹ Testing crawling capabilities...
[ERROR]... × https://crawl4ai.com                               | Error:
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ × Unexpected error in _crawl_web at line 528 in wrap_api_call (.venv/lib/python3.12/site-                             │
│ packages/playwright/_impl/_connection.py):                                                                            │
│   Error: BrowserContext.new_page: Target page, context or browser has been closed                                     │
│   Browser logs:                                                                                                       │
│                                                                                                                       │
│   <launching> /Users/fcoury/Library/Caches/ms-playwright/chromium-1161/chrome-                                        │
│ mac/Chromium.app/Contents/MacOS/Chromium --disable-field-trial-config --disable-background-networking --disable-      │
│ background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad  │
│ --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-component-    │
│ update --no-default-browser-check --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-feat  │
│ ures=AcceptCHFrame,AutoExpandDetailsElement,AvoidUnnecessaryBeforeUnloadCheckSync,CertificateTransparencyComponentUp  │
│ dater,DeferRendererTasksAfterInput,DestroyProfileOnBrowserClose,DialMediaRouteProvider,ExtensionManifestV2Disabled,G  │
│ lobalMediaControls,HttpsUpgrades,ImprovedCookieControls,LazyFrameLoading,LensOverlay,MediaRouter,PaintHolding,ThirdP  │
│ artyStoragePartitioning,Translate --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection   │
│ --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --force-color-profile=srgb       │
│ --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-  │
│ autorun --export-tagged-pdf --disable-search-engine-choice-screen --unsafely-disable-devtools-self-xss-warnings       │
│ --enable-use-zoom-for-dsf=false --headless --hide-scrollbars --mute-audio --blink-                                    │
│ settings=primaryHoverType=2,availableHoverTypes=2,primaryPointerType=4,availablePointerTypes=4 --no-sandbox           │
│ --disable-gpu --disable-gpu-compositing --disable-software-rasterizer --no-sandbox --disable-dev-shm-usage --no-      │
│ first-run --no-default-browser-check --disable-infobars --window-position=0,0 --ignore-certificate-errors --ignore-   │
│ certificate-errors-spki-list --disable-blink-features=AutomationControlled --window-position=400,0 --disable-         │
│ renderer-backgrounding --disable-ipc-flooding-protection --force-color-profile=srgb --mute-audio --disable-           │
│ background-timer-throttling --window-size=1280,720 --disable-background-networking --disable-backgrounding-occluded-  │
│ windows --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-   │
│ pages --disable-default-apps --disable-extensions --disable-features=TranslateUI --disable-hang-monitor --disable-    │
│ popup-blocking --disable-prompt-on-repost --disable-sync --metrics-recording-only --password-store=basic --use-mock-  │
│ keychain --user-data-dir=/var/folders/tf/g_d8fd0j319gf8vhw3rt7yc40000gn/T/playwright_chromiumdev_profile-y7jUQN       │
│ --remote-debugging-pipe --no-startup-window                                                                           │
│   <launched> pid=23887                                                                                                │
│   [pid=23887][err] [0428/203725.475336:WARNING:process_memory_mac.cc(94)] mach_vm_read(0x100fe8000, 0x8000):          │
│ (os/kern) invalid address (1)                                                                                         │
│   [pid=23887][err] [0428/203725.475843:WARNING:process_memory_mac.cc(94)] mach_vm_read(0x100fe8000, 0x8000):          │
│ (os/kern) invalid address (1)                                                                                         │
│   [pid=23887][err] [0428/203725.476076:WARNING:process_memory_mac.cc(94)] mach_vm_read(0x100fe8000, 0x8000):          │
│ (os/kern) invalid address (1)                                                                                         │
│   [pid=23887][err] [0428/203725.476258:WARNING:process_memory_mac.cc(94)] mach_vm_read(0x100fe8000, 0x8000):          │
│ (os/kern) invalid address (1)                                                                                         │
│   [pid=23887][err] [0428/203725.476420:WARNING:process_memory_mac.cc(94)] mach_vm_read(0x100fe8000, 0x8000):          │
│ (os/kern) invalid address (1)                                                                                         │
│   [pid=23887][err] [0428/203725.476563:WARNING:process_memory_mac.cc(94)] mach_vm_read(0x100fe8000, 0x8000):          │
│ (os/kern) invalid address (1)                                                                                         │
│   [pid=23887][err] [0428/203725.540368:WARNING:crash_report_exception_handler.cc(235)] UniversalExceptionRaise:       │
│ (os/kern) failure (5)                                                                                                 │
│                                                                                                                       │
│   Code context:                                                                                                       │
│   523           parsed_st = _extract_stack_trace_information_from_stack(st, is_internal)                              │
│   524           self._api_zone.set(parsed_st)                                                                         │
│   525           try:                                                                                                  │
│   526               return await cb()                                                                                 │
│   527           except Exception as error:                                                                            │
│   528 →             raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None                          │
│   529           finally:                                                                                              │
│   530               self._api_zone.set(None)                                                                          │
│   531                                                                                                                 │
│   532       def wrap_api_call_sync(                                                                                   │
│   533           self, cb: Callable[[], Any], is_internal: bool = False                                                │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

[ERROR]... × Error closing context: BrowserContext.close: Target page, context or browser has been closed
[ERROR]... × ❌ Test failed: Failed to get content

Is this reproducible?

Yes

Inputs Causing the Bug


Steps to Reproduce


Code snippets


OS

macOS Sequoia 15.4.1

Python version

3.12.9

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response

fcoury avatar Apr 28 '25 23:04 fcoury

More things I tried. Created this file:

import asyncio
from playwright.async_api import async_playwright


async def main():
    async with async_playwright() as p:
        print("Launching browser...")
        try:
            # Try launching Chromium specifically
            browser = await p.chromium.launch(
                headless=True
            )  # Match headless mode used by crawl4ai
            print("Browser launched successfully.")
            context = await browser.new_context()
            print("Context created.")
            page = await context.new_page()
            print("Page created.")
            print("Navigating to example.com...")
            await page.goto("https://example.com")
            print("Navigation successful.")
            content = await page.content()
            print("Got content successfully.")
            # print(content[:100]) # Optional: print snippet
            await browser.close()
            print("Browser closed.")
        except Exception as e:
            print(f"An error occurred: {e}")
            # Attempt to close browser if it exists and failed mid-way
            if "browser" in locals() and browser.is_connected():
                await browser.close()


if __name__ == "__main__":
    asyncio.run(main())

And:

fcoury@m3pro ~/c/crawltest on  master [!] via 🐍 v3.12.9 (crawltest)
> $ uv run test_playwright.py
Launching browser...
Browser launched successfully.
Context created.
Page created.
Navigating to example.com...
Navigation successful.
Got content successfully.
Browser closed.

fcoury avatar Apr 28 '25 23:04 fcoury

I have the same issue; let me know when you solve it. It mostly works, but sometimes fails like that.

itsklimov avatar Apr 29 '25 16:04 itsklimov

I have the same issue

ftballguy45 avatar May 05 '25 20:05 ftballguy45

@fcoury Can you share a code sample here. More importantly, are you using multiple instances of AsyncWebCrawler in one go?

aravindkarnam avatar May 09 '25 07:05 aravindkarnam

@fcoury Can you share a code sample here. More importantly, are you using multiple instances of AsyncWebCrawler in one go?

What do you mean by a code sample? Did you read my initial post? I am using the crawl4ai-doctor command that comes with the library.

fcoury avatar May 10 '25 15:05 fcoury

I managed to fix this by unsetting the env var DYLD_LIBRARY_PATH. This was causing Playwright to load the wrong binaries and libraries.
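For reference, the same workaround can be applied from Python before the crawler starts. This is only a sketch of the environment cleanup described above, not crawl4ai code; any subprocess started afterwards (including the Chromium that Playwright spawns) inherits the cleaned environment:

```python
import os

# Workaround sketch: remove DYLD_LIBRARY_PATH from the current process
# environment before Playwright launches Chromium, so the browser binary
# resolves its own bundled dylibs instead of the ones the variable names.
# (Equivalent to `unset DYLD_LIBRARY_PATH` in the shell.)
os.environ.pop("DYLD_LIBRARY_PATH", None)

# Subprocesses launched from here on no longer see the variable.
assert "DYLD_LIBRARY_PATH" not in os.environ
```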

fcoury avatar May 15 '25 15:05 fcoury

@fcoury Thanks for sharing this! We've been having a hard time reproducing this bug. We'll check whether the code can automatically detect and apply this correction; if not, we'll add a troubleshooting step to the docs.

aravindkarnam avatar May 16 '25 07:05 aravindkarnam

@aravindkarnam Just a suggestion: could it be because the dispatcher passes session_id into crawler.arun (result = await self.crawler.arun(url, config=config, session_id=task_id)), but arun does not pass it along to the async crawler strategy?

bigbrother666sh avatar Jun 06 '25 01:06 bigbrother666sh

Issue resolved in 0.7.2. Please reopen if you are still facing this issue.

aravindkarnam avatar Aug 05 '25 09:08 aravindkarnam