Language Support

Open oaishi opened this issue 1 year ago • 2 comments

Hi,

Thanks for the great repository. I am new to it and was curious to know whether there is any support for changing the language before I crawl a certain page.

oaishi · Oct 01 '24 23:10

Thank you for your interest in language support! A crawler cannot directly change the language a website serves, but our library does support setting the Accept-Language header, which many websites use to decide which language to return.

You can set the language preference in a few ways:

  1. When creating the crawler:

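    from crawl4ai import AsyncWebCrawler
    from crawl4ai.async_crawler_strategy import AsyncPlaywrightCrawlerStrategy

    # Prefer French, fall back to English
    crawler = AsyncWebCrawler(
        crawler_strategy=AsyncPlaywrightCrawlerStrategy(
            headers={"Accept-Language": "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7"}
        )
    )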
    
  2. Before crawling:

    crawler.crawler_strategy.headers["Accept-Language"] = "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7"
    
  3. When calling the arun method:

    result = await crawler.arun(
        url,
        headers={"Accept-Language": "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7"}
    )
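
If you would rather not write the header value by hand, it can be built from an ordered list of language tags. This is only a minimal sketch in plain Python: `build_accept_language` is a hypothetical helper, not part of crawl4ai, and the rule of lowering the weight by 0.1 per position is an assumption chosen to reproduce the value used above.

```python
def build_accept_language(languages):
    """Join language tags into an Accept-Language value, most preferred first.

    Hypothetical helper (not a crawl4ai API). The first tag carries the
    implicit weight 1.0; each later tag gets a weight 0.1 lower, floored at 0.1.
    """
    parts = []
    for position, tag in enumerate(languages):
        if position == 0:
            # No q parameter means q=1.0
            parts.append(tag)
        else:
            weight = max(10 - position, 1) / 10
            parts.append(f"{tag};q={weight:.1f}")
    return ",".join(parts)


header_value = build_accept_language(["fr-FR", "fr", "en-US", "en"])
headers = {"Accept-Language": header_value}
```

The resulting `headers` dict can then be passed in any of the three ways listed above.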
    

Please note that the effectiveness of this method depends on the website you're crawling and whether it supports serving content in different languages based on the Accept-Language header.
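
To make that dependence concrete, here is a rough sketch of how a site that honors the header might interpret it. It is a simplified illustration, not what any particular server does: `parse_accept_language` and `pick_language` are hypothetical names, only exact tag matches are considered, and wildcards and prefix matching are ignored.

```python
def parse_accept_language(header_value):
    """Return (tag, weight) pairs sorted from most to least preferred."""
    preferences = []
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        weight = 1.0  # a tag without a q parameter has weight 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat a malformed weight as "not acceptable"
        preferences.append((tag.strip(), weight))
    # sorted() is stable, so tags with equal weight keep their header order
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


def pick_language(header_value, available):
    """Pick the most preferred tag the site can serve, or None (exact match only)."""
    for tag, weight in parse_accept_language(header_value):
        if weight > 0 and tag in available:
            return tag
    return None


header = "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7"
chosen = pick_language(header, {"en", "de"})
```

In this sketch a site offering only English and German would serve English, while a site offering only German would find no match and fall back to its default language. A site that ignores the header entirely returns its default regardless of what you send.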

We're also considering adding more language-related features in future updates. Could you provide more details about your specific use case? This would help us prioritize the most useful approaches for our users.

unclecode · Oct 02 '24 07:10

Thanks so much @unclecode for the suggestion. I will check this out and let you know in case I have any follow-up questions.

oaishi · Oct 07 '24 04:10