[Bug]: result.links is always empty (no internal or external links extracted) when using raw_html_url = f"raw:{raw_html}" as input
crawl4ai version
0.4.247
Expected Behavior
The page contains links to other pages, so they should be returned in result.links.
Current Behavior
all_links = result.links.get("internal", []) + result.links.get("external", []) # always empty
when the crawl is run like this:
raw_html_url = f"raw:{raw_html}"
async with C4AIAsyncWebCrawler(config=browser_config) as crawler:
    result = await crawler.arun(raw_html_url, config=crawler_config)
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
Code snippets
OS
macOS
Python version
3.11.9
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response
@mllife Can you share some sample raw_html inputs that cause the crawler to return empty links in the result?
All of them. I generated the HTML by crawling the page "https://www.bankofcanada.ca/press/"; when I read it back from disk and pass it in as raw HTML, result.links is empty.
It's fixed, and the fix is in the latest release (0.7.4).
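For anyone who cannot upgrade yet, a possible stopgap is to pull the links out of the saved HTML directly. The sketch below is not crawl4ai code and does not reflect how the library extracts links internally; it uses only the Python standard library. The names LinkCollector and extract_links are hypothetical, and base_url is an assumption you must supply yourself, since raw HTML carries no URL of its own against which relative links can be resolved or classified. It returns plain URL strings, which may differ from the shape of the entries in result.links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(raw_html, base_url):
    """Return {"internal": [...], "external": [...]} of absolute URLs.

    base_url is an assumption supplied by the caller: it is used to
    resolve relative hrefs and to decide which host counts as internal.
    """
    parser = LinkCollector()
    parser.feed(raw_html)
    base_host = urlparse(base_url).netloc

    links = {"internal": [], "external": []}
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)
        parsed = urlparse(absolute)
        # Skip mailto:, javascript:, tel: and similar non-web schemes.
        if parsed.scheme not in ("http", "https"):
            continue
        bucket = "internal" if parsed.netloc == base_host else "external"
        # Keep first occurrence only, preserving document order.
        if absolute not in links[bucket]:
            links[bucket].append(absolute)
    return links
```

With the page from this issue, that would be called as extract_links(raw_html, "https://www.bankofcanada.ca/press/"), and the two lists can be concatenated the same way as all_links above.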