crawl4ai
[Bug]: Unexpected error in _crawl_web
crawl4ai version
0.7.7
Expected Behavior
Should parse the webpage correctly.
Current Behavior
When crawling this page: https://www.toshiba-lifestyle.com/th-en/blog/how-to-choose-the-right-laundry-product-for-you
I get the following error:
[ERROR]... × https://www.toshiba-lif...laundry-product-for-you | Error:
Unexpected error in _crawl_web at line 493 in aprocess_html
(../usr/local/lib/python3.12/site-packages/crawl4ai/async_webcrawler.py):
Error: Process HTML, Failed to extract content from the website:
https://www.toshiba-lifestyle.com/th-en/blog/how-to-choose-the-right-laundry-product-for-you, error: 1 validation error for MediaItem
width
Input should be a valid integer, unable to parse string as an integer
For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
Code context:
488 )
489
490 except InvalidCSSSelectorError as e:
491 raise ValueError(str(e))
492 except Exception as e:
493 → raise ValueError(
494 f"Process HTML, Failed to extract content from the website: {url}, error: {str(e)}"
495 )
496
497 # Extract results - handle both dict and ScrapingResult
498 if isinstance(result, dict):
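It looks like something in the page's own HTML is unusual: judging by the validation error, a media element has a width value that is not a plain integer, so MediaItem validation fails and the crawl of this page errors out instead of returning a result.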
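The validation error above says that a width value could not be parsed as an integer. The log does not show the offending value, so the following is only a sketch of the kind of tolerant parsing that would avoid the failure, assuming the page uses something like "100%", "auto", or "300px" in a width attribute. The helper name parse_dimension is hypothetical and is not part of crawl4ai.

```python
import re
from typing import Optional

# Matches a plain pixel count such as "300", " 640 ", "300px" or "12.5".
# Anything else ("100%", "auto", "") is treated as "no usable size".
_PIXEL_VALUE = re.compile(r"^\s*(\d+)(?:\.\d+)?\s*(?:px)?\s*$", re.IGNORECASE)


def parse_dimension(raw: Optional[object]) -> Optional[int]:
    """Hypothetical helper: coerce an HTML width/height attribute to an int.

    Returns None instead of raising when the value is not a pixel count,
    so a single odd attribute cannot fail validation for the whole page.
    """
    if raw is None:
        return None
    match = _PIXEL_VALUE.match(str(raw))
    if match is None:
        return None
    # Fractional pixel values are truncated to their integer part.
    return int(match.group(1))
```

Applying a coercion like this before the value reaches the model (or declaring the field as optional and validating it leniently) would let the rest of the page be parsed even when one image carries a non-numeric size.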