crawl4ai Bug Report for Crawl4AI multiple async

Hi UncleCode, I hope you are doing well!

First of all, I want to express my gratitude for creating Crawl4AI. It’s a fantastic tool for what I’m exploring.

I did come across a small bug that I wanted to bring to your attention. When I run the scraper with LLMs concurrently, the output format doesn’t seem to align with the Pydantic schema, and it crashes.

This only happens when I’m running it concurrently and combining it with other async scrapers. Instead of my schema fields, the output comes back with the fields index, tags, and content.
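To make the symptom easier to spot in a run, here is a minimal sketch of a check that tells schema-shaped output apart from the index/tags/content shape described above. It uses only the standard library; the function name classify_extraction and the example field names are hypothetical and not part of Crawl4AI.

```python
import json

# Keys seen in the unexpected output described in this report.
FALLBACK_KEYS = {"index", "tags", "content"}


def classify_extraction(raw: str, expected_fields: set[str]) -> str:
    """Return 'schema', 'fallback', or 'unknown' for a JSON extraction result.

    Hypothetical helper: `raw` is the extracted content as a JSON string,
    `expected_fields` are the field names of the intended schema.
    """
    try:
        data = json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        return "unknown"

    # Accept either a single object or a list of objects.
    items = data if isinstance(data, list) else [data]
    items = [item for item in items if isinstance(item, dict)]
    if not items:
        return "unknown"

    if all(expected_fields.issubset(item.keys()) for item in items):
        return "schema"
    if all(FALLBACK_KEYS.issubset(item.keys()) for item in items):
        return "fallback"
    return "unknown"
```

Running this on each result before validating against the Pydantic model would show which URLs came back in the wrong shape instead of failing on the first one.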

Oct 07 '24 16:10 jmontoyavallejo

Hello @jmontoyavallejo, thank you so much for your kind words. What you're describing sounds interesting. I would greatly appreciate it if you could share a code sample that I can run to replicate the error you're facing, ideally one that makes concurrent requests to multiple URLs using the LLM. Thx

Oct 08 '24 10:10 unclecode

Hi @unclecode, was this bug fixed? I think the issue still persists, if I am not wrong. I get the desired output when I use crawler.arun inside a for loop, but it gives me HTML tags when I use arun_many. I am attaching sample code and the output.

output.txt

crawler_arun_many.txt
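For readers without the attachments, the comparison described above (one call at a time in a loop versus many calls at once) can be sketched roughly as follows. This is not the attached code: extract is a hypothetical stand-in for a single crawl-and-extract call, and no Crawl4AI API is used, so only the shape of the comparison is shown.

```python
import asyncio


async def extract(url: str) -> dict:
    """Hypothetical stand-in for one crawl-and-extract call."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"url": url, "title": f"title for {url}"}


async def run_sequential(urls: list[str]) -> list[dict]:
    # One call at a time, mirroring a single-URL call inside a for loop.
    return [await extract(url) for url in urls]


async def run_concurrent(urls: list[str]) -> list[dict]:
    # All calls in flight at once; gather returns results in input order.
    return await asyncio.gather(*(extract(url) for url in urls))


def compare(urls: list[str]) -> bool:
    """Return True if both modes produce identical results."""

    async def _both():
        return await run_sequential(urls), await run_concurrent(urls)

    sequential, concurrent = asyncio.run(_both())
    return sequential == concurrent
```

Replacing the stand-in with the real crawler calls and diffing the two result lists would narrow the problem down to the concurrent path.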

Sorry if this bug was fixed and I am somehow implementing this wrong. Thanks. I really appreciate the effort you have given to create and maintain crawl4ai.

Jan 24 '25 13:01 saurabhj9

@saurabhj9 Thanks for sharing your code sample. Could you try your code again with the recent beta version? The way you work with arun_many() has changed. Below is a link to the relevant part of the documentation, which explains how to use it. Most likely, the issue has already been resolved. Let me know if it hasn't.

https://docs.crawl4ai.com/advanced/multi-url-crawling/

Jan 25 '25 11:01 unclecode