archives
Common Crawl
https://commoncrawl.org/
We build and maintain an open repository of web crawl data that can be accessed and analyzed by anyone.
I don't have an exact figure for the total size, but it is far more than a few TB: the full corpus is in the petabyte range.