Tessa Walsh
Hi @tuehlarsen - we use Brave Browser for crawling, which has some ad blocking features enabled. To disable these, you can create a browser profile, go to `brave://settings`, and then...
Hi @tuehlarsen, you can create and edit browser profiles directly in the GUI! Check out this section of the documentation: https://docs.browsertrix.cloud/user-guide/browser-profiles/
Based on some conversations with our collaborators at Ouinet, this issue may be an easier way to implement de-duplication than #1372 (though we want to eventually support both), that could...
Renamed slightly to avoid confusion with https://github.com/webrecorder/browsertrix-cloud/issues/890
Hi Ed, this is really cool! Thanks for sharing :)
> * it would be nice for the script to have some options to limit what was copied
> * maybe...
> Thanks @tw4l! I thought maybe rclone could be used programmatically to pull the set of signed URLs instead of a bucket, and then it could write to the many...
Need to define what we want to do here and for what aims. Should this be a test suite, a benchmarking suite, or something in between? Descheduling until we have...
Should figure out encryption prior to merging/deploying this, so that we don't have an unencrypted copy of our database in an S3 bucket.
Hi @tuehlarsen, it's something we hope to get to before too long, but we don't currently have it roadmapped for a particular release. We will update the issue when it...
@tuehlarsen You'll be happy to know that @vnznznz is working on our first implementation of crawling through SOCKS5 proxies now. In the first stage it will likely just be country-specific, but...