Feature Request: Don't force http/s for websites. Crawl4ai supports file:// for local html.
Makes sense! Added this to the board.
I got this partially working by adding file:// to the supported schemes. I also needed to mount a local volume into the server Docker container so it can read local files.
I can crawl a top-level local file, but it returns local links without a file:// scheme and fails to crawl further.
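A minimal sketch of the scheme change described above, assuming the validation lives in a helper like this (the actual function name and structure in crawling_service.py may differ):

```python
from urllib.parse import urlparse

# Hypothetical helper mirroring the kind of scheme check described above;
# the real check in crawling_service.py likely looks different.
SUPPORTED_SCHEMES = {"http", "https", "file"}  # "file" added per this request

def is_supported_url(url: str) -> bool:
    """Return True if the URL uses a scheme the crawler should accept."""
    return urlparse(url).scheme in SUPPORTED_SCHEMES

print(is_supported_url("file:///docs/index.html"))  # True
print(is_supported_url("ftp://example.com"))        # False
```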
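One way to handle the missing scheme on discovered links is to resolve each link against the file:// URL of the page it came from before queueing it. This is a sketch with stdlib urljoin, not the actual fix in crawling_service.py:

```python
from urllib.parse import urljoin, urlparse

def resolve_local_link(base_url: str, link: str) -> str:
    """Resolve a relative or scheme-less link against a file:// base URL."""
    resolved = urljoin(base_url, link)
    # Guard: only follow links that stay on the local filesystem.
    if urlparse(resolved).scheme != "file":
        raise ValueError(f"refusing non-file link: {resolved}")
    return resolved

print(resolve_local_link("file:///repo/docs/index.html", "articles/intro.html"))
# file:///repo/docs/articles/intro.html
```

urljoin understands the file scheme, so relative paths, ../ segments, and absolute paths all resolve correctly against the base page's location.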
I'm working on a small example with crawl4ai to help figure out where to make the changes in crawling_service.py.
Significant use case: consider repos such as azure-docs.
When a "docs" repo exists, you can git pull outside of Archon, then crawl your local filesystem. And if the docs change: git pull on the local fs -> recrawl in Archon.