[Feat] issue #1 exclude tags (html clean-up)

Open oliviermills opened this issue 10 months ago • 3 comments

Replaced the exclude tag list with a function that does nicer and safer clean-up. Resolves #1. Added basic tests for the function.

Important: we should add an integration test covering a much larger variety of HTML pages; see #15.
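
For readers who have not seen the diff, here is a minimal sketch of what a function-based clean-up could look like. It is not the code from this PR: the thread does not show the implementation, so the names `clean_html`, `EXCLUDED_TAGS`, and `VOID_TAGS`, the default tag list, and the choice of Python's standard `html.parser` are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical defaults; the tags the PR actually excludes are not shown in this thread.
EXCLUDED_TAGS = frozenset({"script", "style", "noscript", "nav", "footer"})
# Void elements never have a closing tag, so they must not open a "skip" region.
VOID_TAGS = frozenset({"area", "base", "br", "col", "embed", "hr", "img",
                       "input", "link", "meta", "source", "track", "wbr"})


class _TagStripper(HTMLParser):
    """Re-emits the HTML it parses, skipping excluded elements and their contents."""

    def __init__(self, excluded):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.excluded = excluded
        self.depth = 0  # how many excluded elements we are currently inside
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.excluded:
            if tag not in VOID_TAGS:
                self.depth += 1
            return
        if self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>; emit it once, with no closing tag.
        if tag not in self.excluded and self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.excluded:
            if self.depth and tag not in VOID_TAGS:
                self.depth -= 1
            return
        if self.depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth == 0:
            self.out.append(f"&#{name};")


def clean_html(html, excluded=EXCLUDED_TAGS):
    """Return html with every excluded element (and its children) removed."""
    parser = _TagStripper(excluded)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `clean_html(page_source)` returns the markup without the excluded elements. Compared with matching tag names against raw text, driving the clean-up from a parser keeps nested excluded elements and script bodies from leaking into the output. Comments and doctype declarations are dropped in this sketch because their handlers are not overridden.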

oliviermills avatar Apr 18 '24 03:04 oliviermills

CI/CD is failing because we hit the LlamaParse rate limit.

rafaelsideguide avatar Apr 19 '24 13:04 rafaelsideguide

@rafaelsideguide Now that we have an initial testing framework, we should start testing these changes and get this merged.

Also @oliviermills, one quick thing: I noticed that this PR was opened before we switched to AGPL 3.0. Can you confirm that you agree to relicense your contributions under the new license? Once that's done, we can proceed with merging them!

(You can just write a comment here saying "I agree to relicense my contributions to the AGPL". See https://github.com/mendableai/firecrawl/pull/134 for more context.)

Thank you.

nickscamara avatar May 15 '24 01:05 nickscamara

I agree to relicense my contributions to the AGPL 🤗

oliviermills avatar May 15 '24 02:05 oliviermills

Closing this as #273 resolves this problem.

rafaelsideguide avatar Jun 14 '24 14:06 rafaelsideguide