deepcrawl
100% free and fully open-source Firecrawl alternative for the edge, with better link extraction for agents, that you can deploy to Cloudflare or Vercel yourself.
Deepcrawl
WARNING: DO NOT USE DEEPCRAWL IN PRODUCTION RIGHT NOW AS IT IS SUBJECT TO CHANGE AND STILL UNDER RAPID DEVELOPMENT. USE AT YOUR OWN RISK!
100% free and open-source Firecrawl alternative with better performance and flexibility.

NOTE: Deepcrawl is not designed to get around anti-scraping or anti-bot protections. It is optimized for high-frequency agent workloads that scrape public pages to extract cleaned Markdown and a hierarchical links tree.
Deepcrawl is an agent-oriented platform for extracting context data from websites. It extracts cleaned Markdown of the page content, an agent-friendly hierarchical links tree, and metadata that LLMs can digest at minimal token cost, which reduces context switching and hallucination.
The full platform (Next.js dashboard, API workers, auth workers, and database) is open and transparent.
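To illustrate what a hierarchical links tree can look like, here is a minimal TypeScript sketch that groups a flat list of same-site URLs by path segment. It is not Deepcrawl's implementation or response schema: the `LinkNode` shape and the `buildLinksTree` and `renderTree` helpers are hypothetical names introduced only for this example.

```typescript
// Hypothetical node shape for illustration; not Deepcrawl's actual schema.
interface LinkNode {
  segment: string; // one path segment, or the origin for the root node
  url: string; // full URL up to and including this segment
  children: LinkNode[];
}

// Group a flat list of links into a tree keyed by path segment.
// `origin` is assumed to be a bare origin such as "https://example.com".
function buildLinksTree(origin: string, urls: string[]): LinkNode {
  const root: LinkNode = { segment: origin, url: origin, children: [] };

  for (const raw of urls) {
    // Resolve relative links against the origin.
    const parsed = new URL(raw, origin);
    if (parsed.origin !== origin) continue; // skip external links

    const segments = parsed.pathname.split("/").filter((s) => s.length > 0);
    let node = root;
    let path = "";

    for (const segment of segments) {
      path += "/" + segment;
      // Reuse an existing child for this segment, or create one.
      let child = node.children.find((c) => c.segment === segment);
      if (!child) {
        child = { segment, url: origin + path, children: [] };
        node.children.push(child);
      }
      node = child;
    }
  }
  return root;
}

// Render the tree as an indented outline, one segment per line.
function renderTree(node: LinkNode, depth = 0): string {
  const line = "  ".repeat(depth) + node.segment;
  return [line, ...node.children.map((c) => renderTree(c, depth + 1))].join("\n");
}
```

An indented outline like the one `renderTree` produces names each shared path prefix once instead of repeating it in every URL, which is one way such a tree can keep token cost low for an LLM.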
Documentation
Visit https://deepcrawl.dev/docs to view the documentation.
Contributing
Please read the contributing guide.
License
Open Source. Open Code - built with ❤️ by @felixLu.