crawl4ai
AsyncWebCrawler returns arrays of JSON objects instead of single objects per scrape
Description
The AsyncWebCrawler is currently returning arrays of JSON objects for each scrape, even when a Pydantic schema and prompt are specified to return only one JSON object per scrape. This behavior is causing issues in our data processing pipeline and needs to be addressed.
Current Behavior
- The `AsyncWebCrawler` returns an array of JSON objects for each scraped page.
- This occurs even when a Pydantic schema is provided to define the structure of a single object.
- The prompt given to the crawler also specifies that only one JSON object should be returned per scrape.
Expected Behavior
- The `AsyncWebCrawler` should return a single JSON object for each scraped page.
- The returned object should conform to the provided Pydantic schema.
- The crawler should respect the prompt that specifies returning only one JSON object per scrape.
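Until the cause is identified, the expected shape can be enforced on the consumer side. The following is a minimal sketch, assuming the extracted content arrives as a JSON string; `single_object` is a hypothetical helper written for illustration and is not part of crawl4ai. It uses only the standard library, so validating against the Pydantic schema would be a separate step afterwards.

```python
import json
from typing import Any, Dict


def single_object(extracted_content: str) -> Dict[str, Any]:
    """Return exactly one JSON object from the extractor output.

    Hypothetical helper (not a crawl4ai API). Accepts either a JSON
    object or a JSON array holding exactly one object. Anything else
    raises ValueError so the shape mismatch is not silently hidden.
    """
    data = json.loads(extracted_content)

    # Already the expected shape: a single JSON object.
    if isinstance(data, dict):
        return data

    # The shape reported in this issue: an array wrapping the object.
    if isinstance(data, list) and len(data) == 1 and isinstance(data[0], dict):
        return data[0]

    # Several objects, an empty array, or a scalar: refuse to guess.
    raise ValueError(
        f"expected one JSON object, got {type(data).__name__}: {data!r}"
    )
```

Raising on arrays with more than one element is a deliberate choice here: picking the first element would hide the behavior this issue describes rather than surface it.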
Steps to Reproduce
- Set up an `AsyncWebCrawler` instance with a specified Pydantic schema.
- Provide a prompt that clearly states to return a single JSON object.
- Perform a scrape operation on a target URL.
- Observe that the returned result is an array of JSON objects instead of a single object.
Can you show me a sample of the code you're running? I'm currently testing it and I'd appreciate it if you shared your code so I can review it.
@Udbhav8 Closing this issue due to inactivity. Please open a new issue if the problem still exists.