
[Bug]: Error: Browser.new_context: Target page, context or browser has been closed." [ERROR]

Open tdiprima opened this issue 7 months ago • 3 comments

crawl4ai version

0.6.0

Expected Behavior

The crawler should either:

  1. Successfully crawl the page and convert to markdown, or
  2. Provide a clear error message about the browser automation failure

Current Behavior

The command fails with error: 'NoneType' object has no attribute 'raw_markdown'
[ERROR]... × Error closing context: BrowserContext.close: Target page, context or browser has been closed

crwl crawl https://www.nbcnews.com/business -o markdown  # or any URL

Is this reproducible?

Yes

Inputs Causing the Bug

Additional Context

  1. The target website is accessible (verified via curl)
  2. The html2text package is installed and available
  3. Playwright browsers are installed
  4. The issue persists with explicit configuration of markdown generator and browser type
  5. The error occurs because the browser automation fails silently, and the code attempts to process non-existent content

Suggested Fix

  1. Improve error handling when browser automation fails
  2. Provide clear error messages about browser automation failures instead of NoneType errors
  3. Add proper validation of content before attempting markdown conversion
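
The validation proposed above could look roughly like the following. This is a minimal sketch, not crawl4ai's actual code: `CrawlResult`, `MarkdownResult`, `CrawlFailedError`, and `extract_markdown` are hypothetical stand-ins written for illustration, and only the `raw_markdown` attribute name comes from the error message in this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarkdownResult:
    """Hypothetical stand-in for the object that carries `raw_markdown`."""
    raw_markdown: str


@dataclass
class CrawlResult:
    """Hypothetical stand-in for a crawl result; field names are assumptions."""
    success: bool
    markdown: Optional[MarkdownResult] = None
    error_message: str = ""


class CrawlFailedError(RuntimeError):
    """Raised when there is no content to convert."""


def extract_markdown(result: Optional[CrawlResult]) -> str:
    """Return the raw markdown, or raise a descriptive error instead of
    letting an AttributeError on None escape."""
    if result is None:
        raise CrawlFailedError(
            "Crawl returned no result; browser automation may have failed."
        )
    if not result.success:
        reason = result.error_message or "unknown error"
        raise CrawlFailedError(f"Crawl failed: {reason}")
    if result.markdown is None:
        raise CrawlFailedError("Crawl succeeded but produced no markdown content.")
    return result.markdown.raw_markdown
```

With a guard like this, a closed browser would surface as "Crawl failed: ..." rather than as 'NoneType' object has no attribute 'raw_markdown'.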

Steps to Reproduce

Run...

crwl crawl https://www.nbcnews.com/business -o markdown  # or any URL

OS

macOS 15.4.1

Python version

3.9

Browser

Chrome

Browser version

136.0.7103.93

Error logs & Screenshots (if applicable)

crwl crawl https://www.nbcnews.com/business -o markdown  # or any URL
Error: 'NoneType' object has no attribute 'raw_markdown'
[ERROR]... × Error closing context: BrowserContext.close: Target page, context or browser has been closed

tdiprima avatar May 09 '25 17:05 tdiprima

Running crawl4ai-doctor failed. "Error: Browser.new_context: Target page, context or browser has been closed." [ERROR]... × ❌ Test failed: Failed to get content

crawl4ai-doctor
[INIT].... → Running Crawl4AI health check...
[INIT].... → Crawl4AI 0.5.0.post6
[TEST].... ℹ Testing crawling capabilities...
[ERROR]... × https://crawl4ai.com... | Error: 
× Unexpected error in _crawl_web at line 528 in wrap_api_call
(../../../../../../usr/local/anaconda3/lib/python3.9/site-packages/playwright/_impl/_connection.py):
  Error: Browser.new_context: Target page, context or browser has been closed
  Browser logs:

  <launching> /Users/me/Library/Caches/ms-playwright/chromium-1169/chrome-mac/Chromium.app/Contents/MacOS/Chromium --disable-field-trial-config --disable-background-networking --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-component-update --no-default-browser-check --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=AcceptCHFrame,AutoExpandDetailsElement,AvoidUnnecessaryBeforeUnloadCheckSync,CertificateTransparencyComponentUpdater,DeferRendererTasksAfterInput,DestroyProfileOnBrowserClose,DialMediaRouteProvider,ExtensionManifestV2Disabled,GlobalMediaControls,HttpsUpgrades,ImprovedCookieControls,LazyFrameLoading,LensOverlay,MediaRouter,PaintHolding,ThirdPartyStoragePartitioning,Translate --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --force-color-profile=srgb --metrics-recording-only --no-first-run --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --disable-search-engine-choice-screen --unsafely-disable-devtools-self-xss-warnings --enable-use-zoom-for-dsf=false --no-sandbox --app=data:text/html, --window-size=600,600 --window-position=1020,10 --test-type= --user-data-dir=/var/folders/47/6nl3w5n91ql2msklj729p6cr0000gn/T/playwright_chromiumdev_profile-4fDEuE --remote-debugging-pipe about:blank
  <launched> pid=68954
  [pid=68954][err] [0510/103720.154041:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10aaf5000, 0x2000): (os/kern) invalid address (1)
  [pid=68954][err] [0510/103720.155624:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10aaf5000, 0x2000): (os/kern) invalid address (1)
  [pid=68954][err] [0510/103720.155939:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10aaf5000, 0x2000): (os/kern) invalid address (1)
  [pid=68954][err] [0510/103720.156193:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10aaf5000, 0x2000): (os/kern) invalid address (1)
  [pid=68954][err] [0510/103720.156421:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10aaf5000, 0x2000): (os/kern) invalid address (1)
  [pid=68954][err] [0510/103720.156656:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10aaf5000, 0x2000): (os/kern) invalid address (1)
  [pid=68954][err] [0510/103720.286523:WARNING:third_party/crashpad/crashpad/handler/mac/crash_report_exception_handler.cc:235] UniversalExceptionRaise: (os/kern) failure (5)
  [pid=68954][err] [68961:6456809:0510/103720.289840:ERROR:base/memory/shared_memory_switch.cc:264] Mach rendezvous failed, terminating process (parent died?)
  [pid=68954][err] [68962:6456811:0510/103720.289843:ERROR:base/memory/shared_memory_switch.cc:264] Mach rendezvous failed, terminating process (parent died?)
  [pid=68954][err] [68963:6456813:0510/103720.289825:ERROR:base/memory/shared_memory_switch.cc:264] Mach rendezvous failed, terminating process (parent died?)
  [pid=68954][err] [68960:6456808:0510/103720.289832:ERROR:base/memory/shared_memory_switch.cc:264] Mach rendezvous failed, terminating process (parent died?)

  Code context:
  523           parsed_st = _extract_stack_trace_information_from_stack(st, is_internal)
  524           self._api_zone.set(parsed_st)
  525           try:
  526               return await cb()
  527           except Exception as error:
  528 →             raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
  529           finally:
  530               self._api_zone.set(None)
  531
  532       def wrap_api_call_sync(
  533           self, cb: Callable[[], Any], is_internal: bool = False

[ERROR]... × ❌ Test failed: Failed to get content
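
For triage, the signals in a log like this one can be pulled out mechanically. The sketch below is an illustration only: `summarize_failure` and `FailureSummary` are hypothetical names, not part of crawl4ai or Playwright, and the patterns assume the log keeps the wording shown in this thread.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Patterns taken from the wording of the logs in this thread (an assumption:
# other Playwright or Chromium versions may phrase these lines differently).
CLOSED_RE = re.compile(
    r"Error: (\w+\.\w+): Target page, context or browser has been closed"
)
PID_RE = re.compile(r"<launched> pid=(\d+)")


@dataclass
class FailureSummary:
    api_call: Optional[str]   # Playwright call that failed, e.g. "Browser.new_context"
    browser_pid: Optional[int]
    mach_vm_read_warnings: int  # crashpad "mach_vm_read" warning lines
    parent_died: bool           # any "Mach rendezvous failed" line present


def summarize_failure(log: str) -> FailureSummary:
    """Extract the failing API call and browser crash hints from a doctor log."""
    closed = CLOSED_RE.search(log)
    pid = PID_RE.search(log)
    return FailureSummary(
        api_call=closed.group(1) if closed else None,
        browser_pid=int(pid.group(1)) if pid else None,
        mach_vm_read_warnings=log.count("mach_vm_read("),
        parent_died="Mach rendezvous failed" in log,
    )
```

A summary with a browser PID, crashpad warnings, and `parent_died=True` suggests the Chromium process itself exited shortly after launch, which is consistent with the failure being reported before any page content was fetched.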

tdiprima avatar May 10 '25 14:05 tdiprima

@tdiprima I'm changing the title of this issue, since the command you mentioned (crwl crawl https://www.nbcnews.com/business -o markdown) works fine; just remove the comment after the #.
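
One plausible reason the trailing comment matters, offered as an assumption rather than a confirmed diagnosis: zsh, the default interactive shell on recent macOS, does not treat # as a comment marker unless its interactive_comments option is set, so the words after # would be passed to crwl as extra arguments. Python's `shlex` can illustrate the difference between the two tokenizations; `split_args` is a hypothetical helper for this example only.

```python
import shlex

COMMAND = "crwl crawl https://www.nbcnews.com/business -o markdown  # or any URL"


def split_args(command: str, honor_comments: bool) -> list:
    """Tokenize a command line, optionally dropping everything after '#'."""
    return shlex.split(command, comments=honor_comments)


# Comment stripped, as a shell that honors '#' would do:
print(split_args(COMMAND, honor_comments=True))
# Comment kept, so '#', 'or', 'any', 'URL' become extra arguments:
print(split_args(COMMAND, honor_comments=False))
```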

As for the other issue, it's an error other users are hitting as well, and it's quite hard to reproduce. I'll update here as soon as we figure it out. In the meantime, try the latest version, 0.6.3, and check whether you still have this issue.

aravindkarnam avatar May 14 '25 07:05 aravindkarnam

@aravindkarnam Hey, just wanted to follow up. I tried the command again using version 0.6.3, and I'm still hitting the same issue, but only on macOS. It works fine on Ubuntu, so I'm guessing this might be platform-specific.

Not super pressed about it anymore since I've decided to move off Mac for this kind of work, but just wanted to flag it in case it helps others or future debugging. Appreciate the updates on the related bug. Good luck hunting it down.

tdiprima avatar May 14 '25 13:05 tdiprima

Same problem here. I'm also using macOS.

After running crawl4ai-doctor, it shows:

(crawl4ai) ➜  crawl4ai crawl4ai-doctor
[INIT].... → Running Crawl4AI health check... 
[INIT].... → Crawl4AI 0.6.3 
[TEST].... ℹ Testing crawling capabilities... 
[ERROR]... × https://crawl4ai.com                               | Error: Unexpected error in _crawl_web at line 558 in wrap_api_call 
(.venv/lib/python3.11/site-packages/playwright/_impl/_connection.py):
Error: BrowserContext.new_page: Target page, context or browser has been closed
Browser logs:

<launching> /Users/fanzhende/Library/Caches/ms-playwright/chromium-1179/chrome-mac/Chromium.app/Contents/MacOS/Chromium --disable-field-trial-config 
--disable-background-networking --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache 
--disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-component-update 
--no-default-browser-check --disable-default-apps --disable-dev-shm-usage --disable-extensions 
--disable-features=AcceptCHFrame,AutoExpandDetailsElement,AvoidUnnecessaryBeforeUnloadCheckSync,CertificateTransparencyComponentUpdater,DestroyProfileOnBrowserClose,DialMediaRouteProvider,ExtensionManifestV2Disabled,GlobalMediaControls,HttpsUpgrades,ImprovedCookieControls,LazyFrameLoading,LensOverlay,MediaRouter,PaintHolding,ThirdPartyStoragePartitioning,Translate --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection 
--disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --force-color-profile=srgb --metrics-recording-only --no-first-run 
--password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --disable-search-engine-choice-screen 
--unsafely-disable-devtools-self-xss-warnings --enable-automation --enable-use-zoom-for-dsf=false --headless --hide-scrollbars --mute-audio 
--blink-settings=primaryHoverType=2,availableHoverTypes=2,primaryPointerType=4,availablePointerTypes=4 --no-sandbox --disable-gpu 
--disable-gpu-compositing --disable-software-rasterizer --no-sandbox --disable-dev-shm-usage --no-first-run --no-default-browser-check --disable-infobars
--window-position=0,0 --ignore-certificate-errors --ignore-certificate-errors-spki-list --disable-blink-features=AutomationControlled 
--window-position=400,0 --disable-renderer-backgrounding --disable-ipc-flooding-protection --force-color-profile=srgb --mute-audio 
--disable-background-timer-throttling --window-size=1280,720 --disable-background-networking --disable-backgrounding-occluded-windows --disable-breakpad 
--disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-extensions 
--disable-features=TranslateUI --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --metrics-recording-only 
--password-store=basic --use-mock-keychain --user-data-dir=/var/folders/2b/3_lqmflj4qx3nl8cf287333r0000gn/T/playwright_chromiumdev_profile-zx4ScD 
--remote-debugging-pipe --no-startup-window
<launched> pid=20743
 [0708/103215.865348:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10231c000, 0x8000): (os/kern) invalid address (1)
 [0708/103215.865848:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10231c000, 0x8000): (os/kern) invalid address (1)
 [0708/103215.866050:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10231c000, 0x8000): (os/kern) invalid address (1)
 [0708/103215.866229:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10231c000, 0x8000): (os/kern) invalid address (1)
 [0708/103215.866304:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10231c000, 0x8000): (os/kern) invalid address (1)
 [0708/103215.866437:WARNING:third_party/crashpad/crashpad/util/process/process_memory_mac.cc:94] mach_vm_read(0x10231c000, 0x8000): (os/kern) invalid address (1)

Code context:
 553           parsed_st = _extract_stack_trace_information_from_stack(st, is_internal, title)
 554           self._api_zone.set(parsed_st)
 555           try:
 556               return await cb()
 557           except Exception as error:
 558 →             raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
 559           finally:
 560               self._api_zone.set(None)
 561   
 562       def wrap_api_call_sync(
 563           self, cb: Callable[[], Any], is_internal: bool = False, title: str = None 
[ERROR]... × ❌ Test failed: Failed to get content 

Fanzzzd avatar Jul 08 '25 02:07 Fanzzzd

I have checked the issues raised here, and all are working correctly in the latest version. Please ensure you are using the latest version (0.7.2).

I will close the issue now, but feel free to continue the discussion and ask any questions here.

ntohidi avatar Aug 04 '25 08:08 ntohidi