Does Jina.ai scrape the websites anonymously or non-anonymously?
Hi Team,
I wanted to understand whether Jina.ai scrapes websites anonymously or non-anonymously, for a use case at my company. If we have legal approval from website owners to scrape their websites, does Jina.ai announce who it is scraping on behalf of?
Hi @deathofabat.
Reader scrapes the website using a headless Chrome browser and sends the corresponding Chrome User-Agent (UA) string.
You can customize this UA, though, using the x-user-agent header.
In addition, we recently added an x-robots-txt option that checks the website's robots.txt and fails the request if scraping the page is explicitly prohibited by the site owner.
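As a rough illustration of the two headers mentioned above, here is a minimal Python sketch that builds a Reader request carrying a custom UA and a robots.txt check. It assumes the public Reader endpoint `https://r.jina.ai/` followed by the target URL. The helper name `build_reader_request`, the example UA string, and the idea that `x-robots-txt` takes the bot name to check against robots.txt are assumptions for illustration, so please verify the expected values against the Reader documentation.

```python
import urllib.request
from typing import Optional

# Assumption: the public Reader endpoint; the target URL is appended to it.
READER_ENDPOINT = "https://r.jina.ai/"


def build_reader_request(
    target_url: str,
    user_agent: Optional[str] = None,
    robots_txt_agent: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a Reader request with optional headers.

    build_reader_request is a hypothetical helper, not part of any Jina SDK.
    """
    headers = {}
    if user_agent:
        # Overrides the default Chrome UA that Reader would otherwise send.
        headers["x-user-agent"] = user_agent
    if robots_txt_agent:
        # Assumption: the value is the bot name checked against robots.txt.
        headers["x-robots-txt"] = robots_txt_agent
    return urllib.request.Request(READER_ENDPOINT + target_url, headers=headers)


# Hypothetical UA that announces who the scraping is done on behalf of.
request = build_reader_request(
    "https://example.com/page",
    user_agent="AcmeCrawler/1.0 (on behalf of example.com)",
    robots_txt_agent="AcmeCrawler",
)

# To actually send it (requires network access):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

With a UA like this, the site owner can identify your traffic in their logs, which may help when you already have their approval to scrape.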