crawl4ai
[Bug]: Cannot handle dynamic content
crawl4ai version
1.6.3
Expected Behavior
The crawler returns the dynamically loaded content.
Current Behavior
Reusing the same crawler session fails with "Error: list index out of range".
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
Code snippets
import asyncio
from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

async def main():
    # Step 1: Load initial Hacker News page
    config = CrawlerRunConfig(
        wait_for="css:.athing:nth-child(30)"  # Wait for 30 items
    )
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://news.ycombinator.com",
            config=config
        )
        print("Initial items loaded.")

        # Step 2: Let's scroll and click the "More" link
        load_more_js = [
            "window.scrollTo(0, document.body.scrollHeight);",
            # The "More" link at page bottom
            "document.querySelector('a.morelink')?.click();"
        ]
        next_page_conf = CrawlerRunConfig(
            js_code=load_more_js,
            wait_for="""js:() => {
                return document.querySelectorAll('.athing').length > 30;
            }""",
            # Mark that we do not re-navigate, but run JS in the same session:
            js_only=True,
            session_id="hn_session"
        )

        # Re-use the same crawler session
        result2 = await crawler.arun(
            url="https://news.ycombinator.com",  # same URL but continuing session
            config=next_page_conf
        )
        total_items = result2.cleaned_html.count("athing")
        print("Items after load-more:", total_items)

if __name__ == "__main__":
    asyncio.run(main())
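A variant that may be worth trying while this is triaged. In the snippet above, only the second call sets session_id="hn_session"; the first call sets none. The sketch below assumes (not confirmed as the cause) that js_only=True needs a page already opened under that session, so it passes the same session_id to both calls. The names SESSION_ID, URL, FIRST_PAGE_KWARGS and NEXT_PAGE_KWARGS are only illustrative. This has not been run against the affected version.

```python
import asyncio

SESSION_ID = "hn_session"  # same value as in the snippet above
URL = "https://news.ycombinator.com"

# Keyword arguments for the two CrawlerRunConfig objects.
# Assumption being tested: the first call must open the session
# that the js_only call later continues.
FIRST_PAGE_KWARGS = {
    "wait_for": "css:.athing:nth-child(30)",
    "session_id": SESSION_ID,  # the only change from the original snippet
}
NEXT_PAGE_KWARGS = {
    "js_code": [
        "window.scrollTo(0, document.body.scrollHeight);",
        "document.querySelector('a.morelink')?.click();",
    ],
    "wait_for": "js:() => document.querySelectorAll('.athing').length > 30",
    "js_only": True,
    "session_id": SESSION_ID,
}


async def main():
    # Imported here so the kwargs above can be inspected without a browser.
    from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

    async with AsyncWebCrawler() as crawler:
        # Step 1: load the page inside the named session
        await crawler.arun(url=URL, config=CrawlerRunConfig(**FIRST_PAGE_KWARGS))
        # Step 2: run the JS in that same session without re-navigating
        result2 = await crawler.arun(
            url=URL, config=CrawlerRunConfig(**NEXT_PAGE_KWARGS)
        )
        print("Second call succeeded:", result2.success)
```

Run it with asyncio.run(main()). If the index error disappears with this change, that would point at the missing session on the first call. If it persists, the problem is elsewhere.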
OS
Windows 10
Python version
3.13
Browser
Chromium
Browser version
136.0.7103.25