firecrawl
[Feat] Improved includeMain content
#p3nnywh1stl3 on Discord had a great suggestion for the tags and selectors to exclude in order to get tidy content from a website:
["script", "style", "nav", "header", "footer", ".advertisement", ".sidebar", ".nav", ".menu", "#comments", "img", "a"]
It would be interesting to add these tags and selectors to onlyMainContent in v1.
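
To make the effect of this list concrete, here is a rough sketch of what excluding these entries could look like. It is not Firecrawl's implementation: `matches`, `MainContentExtractor`, and `extract_text` are hypothetical names, it uses only Python's standard `html.parser`, it handles only plain tag, `.class`, and `#id` selectors, and it assumes well-formed HTML where every non-void element is closed.

```python
from html.parser import HTMLParser

# Selector list quoted from the suggestion above.
EXCLUDE = ["script", "style", "nav", "header", "footer", ".advertisement",
           ".sidebar", ".nav", ".menu", "#comments", "img", "a"]

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def matches(tag, attrs, selectors):
    """Return True if the element matches a tag, .class, or #id selector."""
    classes = (attrs.get("class") or "").split()
    for sel in selectors:
        if sel.startswith("."):
            if sel[1:] in classes:
                return True
        elif sel.startswith("#"):
            if attrs.get("id") == sel[1:]:
                return True
        elif sel == tag:
            return True
    return False


class MainContentExtractor(HTMLParser):
    """Hypothetical helper: collects text outside excluded subtrees."""

    def __init__(self, selectors):
        super().__init__(convert_charrefs=True)
        self.selectors = selectors
        self.skip_depth = 0  # > 0 while inside an excluded subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1
        elif matches(tag, dict(attrs), self.selectors):
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside an excluded element.
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, selectors=EXCLUDE):
    """Return the visible text of `html` with excluded elements dropped."""
    parser = MainContentExtractor(selectors)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

One thing the sketch makes visible: because "a" is in the list, the anchor text of every link is dropped along with the link itself, so a sentence like "Hello link world" comes out as "Hello world". That trade-off may be worth weighing before making the list a default.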