firecrawl
[Feat] issue #1 exclude tags (html clean-up)
Replaced the exclude-tag list with a function that cleans up the HTML more carefully and safely. Resolves #1. Added basic tests for the function.
Important: we should add an integration test covering a much larger variety of HTML pages; see #15.
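For readers without the diff at hand, here is a rough sketch of what a function-based clean-up can look like, as opposed to matching a flat list of tags. It is not the code from this PR: it is written in Python with only the standard library, and the names `clean_html`, `EXCLUDED_TAGS`, and `VOID_TAGS`, as well as the tag set itself, are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical exclusion set, for illustration only (not the list from this PR).
EXCLUDED_TAGS = {"script", "style", "noscript", "nav", "footer"}
# Void elements never have a closing tag, so they must not open a skipped region.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _Cleaner(HTMLParser):
    """Re-emits the input HTML while dropping excluded elements and their contents."""

    def __init__(self, excluded):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.excluded = excluded
        self.out = []
        self.skip_stack = []  # excluded elements that are currently open

    def handle_starttag(self, tag, attrs):
        if tag in self.excluded:
            if tag not in VOID_TAGS:
                self.skip_stack.append(tag)
            return
        if not self.skip_stack:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>; nothing to push onto the stack.
        if not self.skip_stack and tag not in self.excluded:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_stack:
            if tag == self.skip_stack[-1]:
                self.skip_stack.pop()
            return
        if tag not in self.excluded:  # ignore stray closers of excluded tags
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_stack:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_stack:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_stack:
            self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")  # keep <!DOCTYPE ...>; comments are dropped


def clean_html(html, excluded=EXCLUDED_TAGS):
    """Return html with every excluded element (and its subtree) removed."""
    cleaner = _Cleaner(set(excluded))
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)
```

Calling `clean_html('<div><script>alert(1)</script><p>Hello &amp; bye</p></div>')` returns `'<div><p>Hello &amp; bye</p></div>'`. Tracking open excluded elements on a stack is what lets nested cases (a `footer` inside a `footer`) be removed as a whole, which is the kind of case a larger integration suite would need to cover.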
CI/CD is failing because we hit the LlamaParse rate limit.
@rafaelsideguide Now that we have an initial testing framework, we should start testing these changes and get this merged.
Also, @oliviermills, one quick thing: I noticed that this PR was opened before we switched to AGPL 3.0. Can you confirm that you agree to relicense your contributions under the new license? Once that's done, we can proceed with merging them!
(you can just write a comment here saying "I agree to relicense my contributions to the AGPL" - see https://github.com/mendableai/firecrawl/pull/134 for more context)
Thank you.
I agree to relicense my contributions to the AGPL 🤗
Closing this as #273 resolves this problem.