Does Jina.ai scrape the websites anonymously or non-anonymously?

Open deathofabat opened this issue 1 year ago • 1 comment

Hi Team,

I'd like to understand whether Jina.ai scrapes websites anonymously or non-anonymously, for a use case at my company. If we have legal approval from website owners to scrape their sites, does Jina.ai announce who it is scraping on behalf of?

deathofabat avatar Feb 24 '25 16:02 deathofabat

Hi @deathofabat.

Reader scrapes websites using a headless Chrome browser and sends the corresponding Chrome browser User-Agent (UA).

You can customize this UA, though, using the x-user-agent header. In addition, we recently added an x-robots-txt option that checks the site's robots.txt and fails the request if scraping the page is explicitly prohibited by the site owner.
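
For illustration, here is a minimal Python sketch of how a request carrying both headers might be assembled. Several things here are assumptions rather than confirmed details: that Reader is called by prefixing the target URL with `https://r.jina.ai/`, that the `x-robots-txt` value is the bot name to check against robots.txt, and the `ExampleCorpBot` UA string and `build_reader_request` helper, which are hypothetical names. Please check the Reader documentation for the exact header semantics.

```python
from urllib.request import Request

# Assumption: Reader is invoked by prefixing the target URL with this endpoint.
READER_ENDPOINT = "https://r.jina.ai/"


def build_reader_request(target_url, user_agent=None, robots_txt_agent=None):
    """Build (but do not send) a Reader request with optional identity headers.

    Hypothetical helper for illustration only.
    """
    headers = {}
    if user_agent:
        # Header named in the reply above: overrides the default Chrome UA.
        headers["x-user-agent"] = user_agent
    if robots_txt_agent:
        # Header named in the reply above; the value format (a bot name to
        # match against robots.txt rules) is an assumption.
        headers["x-robots-txt"] = robots_txt_agent
    return Request(READER_ENDPOINT + target_url, headers=headers)


# Hypothetical UA string that announces who the scrape is on behalf of.
req = build_reader_request(
    "https://example.com",
    user_agent="ExampleCorpBot/1.0 (+https://example.com/bot)",
    robots_txt_agent="ExampleCorpBot",
)
```

The sketch only builds the request object; passing `req` to `urllib.request.urlopen` would actually send it. A descriptive UA like the one above is one way to scrape non-anonymously when the site owner has approved it.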

nomagick avatar Mar 13 '25 09:03 nomagick