AsyncWebCrawler is not re-using localstorage

Open Jevli opened this issue 1 year ago • 0 comments

It looks like, when using AsyncPlaywrightCrawlerStrategy with AsyncWebCrawler, actions performed in the on_browser_created hook do not carry over to crawler.arun. Specifically, I mean the browser's localStorage.

I found a website that stores token and refreshToken in the browser's localStorage when you log in to the page. But when I configure the on_browser_created hook to take care of the login, AsyncWebCrawler doesn't reuse that localStorage.

I followed the code to async_crawler_strategy.py, and it seems that when the context is accessed under use_persistent_context, only the Browser is collected and not a Browser context from the Browser.contexts list. I also couldn't find a way to tell the strategy to use the whole context.

Is it an oversight that localStorage (from the on_browser_created hook) can't be reused? Or is this a bug or a missing feature?

I can't find a way to pass a storageState to the crawler strategy or the crawler so that it could use the cookie/localStorage cache from there. Is there one?
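To make the request concrete, here is a minimal sketch of the behaviour being asked for, written against plain Playwright rather than any crawl4ai API. It assumes Playwright's documented storage state shape (`cookies` plus `origins` with `localStorage` entries) and the `storage_state` argument of `browser.new_context`. The helper names `build_storage_state` and `read_token_with_state` are hypothetical, as are the example origin and token values. Whether crawl4ai exposes a way to pass such a state through is exactly the open question here.

```python
import asyncio


def build_storage_state(origin, local_storage, cookies=None):
    """Build a dict in Playwright's storage state format.

    `origin` is the site origin (scheme + host), `local_storage` is a
    plain dict of key/value strings, `cookies` is an optional list of
    Playwright cookie dicts.
    """
    return {
        "cookies": list(cookies or []),
        "origins": [
            {
                "origin": origin,
                "localStorage": [
                    {"name": name, "value": value}
                    for name, value in local_storage.items()
                ],
            }
        ],
    }


async def read_token_with_state(url, state, key="token"):
    """Open `url` in a fresh context seeded with `state` and read one key.

    Playwright is imported lazily so the helper above can be used
    without a browser installed.
    """
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        # Seeding the context is the step that seems to be missing
        # when the crawler only keeps the Browser object around.
        context = await browser.new_context(storage_state=state)
        page = await context.new_page()
        await page.goto(url)
        value = await page.evaluate(
            "(k) => window.localStorage.getItem(k)", key
        )
        await browser.close()
        return value


if __name__ == "__main__":
    # Hypothetical values standing in for what a login would store.
    state = build_storage_state(
        "https://example.com",
        {"token": "abc", "refreshToken": "def"},
    )
    print(asyncio.run(read_token_with_state("https://example.com", state)))
```

In a real login flow the state would more likely come from `await context.storage_state()` on the context that performed the login, rather than being built by hand.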

Jevli · Dec 05 '24 20:12