[Bug]: Random user agent not closing browser?

Open: digiSal opened this issue 7 months ago • 1 comment

crawl4ai version

0.6.1

Expected Behavior

I'm having issues with user_agent_mode="random". Below is the config I have in the playground for the URL example.com. I'm starting to think I can't, or shouldn't, use random mode together with a profile I have mounted. I narrowed the problem down to user_agent_mode by process of elimination.

BrowserConfig(
    headless=True,
    use_managed_browser=True,
    user_data_dir="/persistent_profile",
    user_agent_mode="random",
    extra_args=[
        "--no-sandbox",
        "--disable-gpu",
    ],
)

Current Behavior

Lock file error:

2025-05-02 00:25:25,633 INFO reaped unknown pid 696 (exit status 0)
2025-05-02 00:25:25,662 INFO reaped unknown pid 698 (exit status 0)
2025-05-02 00:25:25,663 INFO reaped unknown pid 701 (exit status 0)
2025-05-02 00:25:25,663 INFO reaped unknown pid 702 (exit status 0)
[ERROR]... × Browser process terminated during startup | Code: 21 | STDOUT: | STDERR: [694:694:0502/002525.328204:ERROR:process_singleton_posix.cc(340)] Failed to create /persistent_profile/SingletonLock: File exists (17)
[694:694:0502/002525.329247:ERROR:chrome_main_delegate.cc(530)] Failed to create a ProcessSingleton for your profile directory. This means that running multiple instances would start multiple browser processes rather than opening a new window in the existing process. Aborting now to avoid profile corruption.
[701:701:0100/000000.350396:ERROR:zygote_linux.cc(664)] write: Broken pipe (32)

[INIT].... → Crawl4AI 0.6.1
[FETCH]... ↓ https://example.com | ✓ | ⏱: 0.71s
[SCRAPE].. ◆ https://example.com | ✓ | ⏱: 0.01s
[COMPLETE] ● https://example.com | ✓ | ⏱: 0.73s
2025-05-02 00:25:28,786 - api - INFO - Memory usage: Start: 104.0546875 MB, End: 106.41015625 MB, Delta: 2.35546875 MB, Peak: 106.41015625 MB
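For anyone hitting the same `SingletonLock` error, here is a minimal sketch of how the lock could be inspected before a crawl. It is not part of crawl4ai. It assumes the usual Chromium behavior on Linux, where `SingletonLock` is a symlink whose target has the form `hostname-pid`; the helper names (`lock_owner_pid`, `pid_is_alive`, `remove_stale_lock`) are made up for illustration. It only checks the PID in the current PID namespace and ignores the hostname part, so treat the result as a hint, especially inside containers.

```python
import os
from pathlib import Path


def lock_owner_pid(profile_dir):
    """Return the PID recorded in SingletonLock, or None if there is no usable lock."""
    lock = Path(profile_dir) / "SingletonLock"
    if not lock.is_symlink():
        return None
    target = os.readlink(lock)  # assumed form: "<hostname>-<pid>"
    # The hostname may itself contain hyphens, so split on the last one only.
    _, _, pid_text = target.rpartition("-")
    return int(pid_text) if pid_text.isdigit() else None


def pid_is_alive(pid):
    """Check whether a process with this PID exists in the current PID namespace."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_lock(profile_dir):
    """Delete SingletonLock only if its recorded owner is gone. Returns True if removed."""
    pid = lock_owner_pid(profile_dir)
    if pid is None or pid_is_alive(pid):
        return False
    (Path(profile_dir) / "SingletonLock").unlink()
    return True
```

Calling `remove_stale_lock("/persistent_profile")` before starting the browser would clear a lock left behind by a browser that exited without cleaning up. If the error is instead caused by two launches racing for the same profile, removing the lock is the wrong fix, so this is only worth trying once it is clear that no other browser is using the directory.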

Is this reproducible?

Yes

Inputs Causing the Bug


Steps to Reproduce


Code snippets


OS

Linux

Python version

3.10.12

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response

digiSal, May 02 '25 01:05

Hmm, I got the error again, but I'm no longer running with user_agent_mode="random". If my math is right, almost 30 minutes passed before I ran another crawl. Here is the output log:

[COMPLETE] ● https://www.example.com                                                                              | ✓ | ⏱: 0.72s
2025-05-02 01:42:29,458 - api - INFO - Memory usage: Start: 128.53125 MB, End: 129.6640625 MB, Delta: 1.1328125 MB, Peak: 129.6640625 MB
2025-05-02 02:16:06,834 INFO reaped unknown pid 2022 (exit status 0)
2025-05-02 02:16:06,835 INFO reaped unknown pid 2024 (exit status 0)
2025-05-02 02:16:06,835 INFO reaped unknown pid 2027 (exit status 0)
2025-05-02 02:16:06,835 INFO reaped unknown pid 2028 (exit status 0)
[ERROR]... × Browser process terminated during startup | Code: 21 | STDOUT:  | STDERR: [2020:2020:0502/021606.764391:ERROR:process_singleton_posix.cc(340)] Failed to create /persistent_profile/SingletonLock: File exists (17)
[2020:2020:0502/021606.770834:ERROR:chrome_main_delegate.cc(530)] Failed to create a ProcessSingleton for your profile directory. This means that running multiple instances would start multiple browser processes rather than opening a new window in the existing process. Aborting now to avoid profile corruption.

[INIT].... → Crawl4AI 0.6.1
[FETCH]... ↓ https://www.example.com                                                                              | ✓ | ⏱: 0.83s
[SCRAPE].. ◆ https://www.example.com                                                                              | ✓ | ⏱: 0.02s
[COMPLETE] ● https://www.example.com                                                                              | ✓ | ⏱: 0.86s
2025-05-02 02:16:10,417 - api - INFO - Memory usage: Start: 119.6328125 MB, End: 120.83203125 MB, Delta: 1.19921875 MB, Peak: 120.83203125 MB

There is also this, from a few minutes before all of that:

... Delta: 1.01953125 MB, Peak: 128.91015625 MB
2025-05-02 01:34:49,325 INFO reaped unknown pid 42 (exit status 0)
2025-05-02 01:34:49,325 INFO reaped unknown pid 44 (exit status 0)
2025-05-02 01:34:49,326 INFO reaped unknown pid 47 (exit status 0)
2025-05-02 01:34:49,326 INFO reaped unknown pid 48 (exit status 0)
[FETCH]... ↓ https://www.example.com    
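The Chromium message above says a profile directory can only be owned by one browser instance at a time. The logs do not confirm that overlapping launches are the cause here, but if they were, one way to rule it out is to serialize every crawl that uses the same profile. The sketch below is a generic asyncio pattern, not a crawl4ai API: `run_with_profile` and `fake_crawl` are hypothetical names, and `fake_crawl` merely stands in for whatever coroutine launches the browser. It only protects launches within a single Python process.

```python
import asyncio
from collections import defaultdict

# One asyncio.Lock per profile directory, created lazily on first use.
_profile_locks = defaultdict(asyncio.Lock)


async def run_with_profile(profile_dir, job):
    """Run the coroutine function `job` while holding the lock for `profile_dir`."""
    async with _profile_locks[profile_dir]:
        return await job()


async def demo():
    """Start three jobs at once and record how many ever ran at the same time."""
    active = 0
    peak = 0

    async def fake_crawl():
        # Hypothetical stand-in for a crawl that uses the shared profile.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return "done"

    jobs = (run_with_profile("/persistent_profile", fake_crawl) for _ in range(3))
    results = await asyncio.gather(*jobs)
    return results, peak
```

Running `asyncio.run(demo())` shows the three jobs completing one after another rather than concurrently. If the server runs several worker processes, an in-process lock like this is not enough and a file-based lock would be needed instead.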

digiSal, May 02 '25 02:05