crawl4ai
[Bug]: Random user agent not closing browser?
crawl4ai version
0.6.1
Expected Behavior
I'm having issues with user_agent_mode = "random". Here is what I have in the playground for the URL example.com. I'm starting to think I can't, or shouldn't, use random mode together with a profile I have mounted; I narrowed it down to user_agent_mode by process of elimination.
BrowserConfig(
    headless=True,
    use_managed_browser=True,
    user_data_dir="/persistent_profile",
    user_agent_mode="random",
    extra_args=[
        "--no-sandbox",
        "--disable-gpu",
    ],
)
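For completeness, a minimal standalone script mirroring what I'm doing in the playground might look like the sketch below (just my approximation using the usual AsyncWebCrawler entry point; the URL is example.com as above):

```python
import asyncio

from crawl4ai import AsyncWebCrawler, BrowserConfig


async def main() -> None:
    # Same settings as the playground config above.
    browser_config = BrowserConfig(
        headless=True,
        use_managed_browser=True,
        user_data_dir="/persistent_profile",
        user_agent_mode="random",  # dropping this line is what avoids the error for me
        extra_args=["--no-sandbox", "--disable-gpu"],
    )
    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(url="https://example.com")
        print("success:", result.success)


asyncio.run(main())
```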
Current Behavior
Lock file error:

2025-05-02 00:25:25,633 INFO reaped unknown pid 696 (exit status 0)
2025-05-02 00:25:25,662 INFO reaped unknown pid 698 (exit status 0)
2025-05-02 00:25:25,663 INFO reaped unknown pid 701 (exit status 0)
2025-05-02 00:25:25,663 INFO reaped unknown pid 702 (exit status 0)
[ERROR]... × Browser process terminated during startup | Code: 21 | STDOUT: | STDERR: [694:694:0502/002525.328204:ERROR:process_singleton_posix.cc(340)] Failed to create /persistent_profile/SingletonLock: File exists (17)
[694:694:0502/002525.329247:ERROR:chrome_main_delegate.cc(530)] Failed to create a ProcessSingleton for your profile directory. This means that running multiple instances would start multiple browser processes rather than opening a new window in the existing process. Aborting now to avoid profile corruption.
[701:701:0100/000000.350396:ERROR:zygote_linux.cc(664)] write: Broken pipe (32)
[INIT].... → Crawl4AI 0.6.1
[FETCH]... ↓ https://example.com | ✓ | ⏱: 0.71s
[SCRAPE].. ◆ https://example.com | ✓ | ⏱: 0.01s
[COMPLETE] ● https://example.com | ✓ | ⏱: 0.73s
2025-05-02 00:25:28,786 - api - INFO - Memory usage: Start: 104.0546875 MB, End: 106.41015625 MB, Delta: 2.35546875 MB, Peak: 106.41015625 MB
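The STDERR points at a leftover Chromium profile lock in the mounted directory. If it helps anyone else hitting this, one possible stopgap (not a fix, and it doesn't explain why the lock is left behind) might be to clear stale Singleton* entries before starting a crawl. A rough sketch, assuming the lock files sit at the top level of /persistent_profile and nothing is actually using the profile:

```python
from pathlib import Path

PROFILE_DIR = Path("/persistent_profile")


def clear_stale_chromium_locks(profile_dir: Path = PROFILE_DIR) -> None:
    """Remove leftover Chromium singleton entries from a mounted profile.

    Only safe when no Chromium process is still using the profile; deleting
    the lock under a live browser risks exactly the profile corruption the
    error message warns about.
    """
    for name in ("SingletonLock", "SingletonCookie", "SingletonSocket"):
        entry = profile_dir / name
        # SingletonLock is usually a (possibly dangling) symlink on Linux,
        # so check is_symlink() as well as exists().
        if entry.is_symlink() or entry.exists():
            entry.unlink(missing_ok=True)


if __name__ == "__main__":
    clear_stale_chromium_locks()
```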
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
Code snippets
OS
Linux
Python version
3.10.12
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response
Hmm, I got the error again, but I haven't been running with user_agent_mode = "random" anymore. Almost 30 minutes passed, if my math is right, before I ran another crawl. Here is the output log:
[COMPLETE] ● https://www.example.com | ✓ | ⏱: 0.72s
2025-05-02 01:42:29,458 - api - INFO - Memory usage: Start: 128.53125 MB, End: 129.6640625 MB, Delta: 1.1328125 MB, Peak: 129.6640625 MB
2025-05-02 02:16:06,834 INFO reaped unknown pid 2022 (exit status 0)
2025-05-02 02:16:06,835 INFO reaped unknown pid 2024 (exit status 0)
2025-05-02 02:16:06,835 INFO reaped unknown pid 2027 (exit status 0)
2025-05-02 02:16:06,835 INFO reaped unknown pid 2028 (exit status 0)
[ERROR]... × Browser process terminated during startup | Code: 21 | STDOUT: | STDERR: [2020:2020:0502/021606.764391:ERROR:process_singleton_posix.cc(340)] Failed to create /persistent_profile/SingletonLock: File exists (17)
[2020:2020:0502/021606.770834:ERROR:chrome_main_delegate.cc(530)] Failed to create a ProcessSingleton for your profile directory. This means that running multiple instances would start multiple browser processes rather than opening a new window in the existing process. Aborting now to avoid profile corruption.
[INIT].... → Crawl4AI 0.6.1
[FETCH]... ↓ https://www.example.com | ✓ | ⏱: 0.83s
[SCRAPE].. ◆ https://www.example.com | ✓ | ⏱: 0.02s
[COMPLETE] ● https://www.example.com | ✓ | ⏱: 0.86s
2025-05-02 02:16:10,417 - api - INFO - Memory usage: Start: 119.6328125 MB, End: 120.83203125 MB, Delta: 1.19921875 MB, Peak: 120.83203125 MB
There is also this a few minutes before all that.
lta: 1.01953125 MB, Peak: 128.91015625 MB
2025-05-02 01:34:49,325 INFO reaped unknown pid 42 (exit status 0)
2025-05-02 01:34:49,325 INFO reaped unknown pid 44 (exit status 0)
2025-05-02 01:34:49,326 INFO reaped unknown pid 47 (exit status 0)
2025-05-02 01:34:49,326 INFO reaped unknown pid 48 (exit status 0)
[FETCH]... ↓ https://www.example.com
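To help figure out whether the managed browser is actually releasing the profile when a crawl finishes (per the title of this issue), I'm thinking of running a check like the sketch below after a crawl completes. It assumes the AsyncWebCrawler context manager is supposed to shut the browser down on exit and that the lock entries live at the top of /persistent_profile:

```python
import asyncio
from pathlib import Path

from crawl4ai import AsyncWebCrawler, BrowserConfig

PROFILE_DIR = Path("/persistent_profile")


async def check_profile_lock_after_crawl() -> None:
    # Same config as above, minus user_agent_mode="random", since the error
    # now shows up even without it.
    browser_config = BrowserConfig(
        headless=True,
        use_managed_browser=True,
        user_data_dir=str(PROFILE_DIR),
        extra_args=["--no-sandbox", "--disable-gpu"],
    )
    async with AsyncWebCrawler(config=browser_config) as crawler:
        await crawler.arun(url="https://www.example.com")

    # Once the context manager exits, the browser should be closed and the
    # profile lock released; anything still listed here would suggest the
    # browser (or at least its lock) is being left behind.
    leftovers = sorted(p.name for p in PROFILE_DIR.glob("Singleton*"))
    print("Leftover singleton entries:", leftovers or "none")


asyncio.run(check_profile_lock_after_crawl())
```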