firecrawl
Sometimes commas are not preserved, and elements of a list get concatenated.
I'm scraping https://huggingface.co/docs/transformers/model_doc/mistral
Here's my code:
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=FIRECRAWL_API_KEY)

def run_firecrawl_scrape(url: str):
    scrape_result = app.scrape_url(url, params={'formats': ['markdown']})
    return scrape_result

sample_url = "https://huggingface.co/docs/transformers/model_doc/mistral"
sample_crawl = run_firecrawl_scrape(sample_url)
print(sample_crawl["markdown"])
I'm seeing that the elements of a comma-separated list get concatenated, with the commas dropped. Here's what it looks like on the website:
And here's the returned markdown: