
Send `Cookie: over18=1` to all reddit URLs

Open ivan opened this issue 9 years ago • 7 comments

On all requests that match `https?://[^/]+\.reddit\.com(/|$)`, we should send a `Cookie: over18=1` header so that we always get the content instead of the age wall.

Perhaps something for the new pipeline/archivebot/wpull/plugin.py?
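A minimal sketch of the matching step, using the pattern quoted above. The helper name `extra_headers_for` is hypothetical, and this is not the wpull plugin API; it only shows which URLs would get the header.

```python
import re

# Pattern quoted from the issue text. It needs at least one character before
# ".reddit.com", so a bare "reddit.com" host does not match.
REDDIT_URL = re.compile(r"https?://[^/]+\.reddit\.com(/|$)")


def extra_headers_for(url):
    """Return extra request headers for `url` (hypothetical helper)."""
    if REDDIT_URL.match(url):
        return {"Cookie": "over18=1"}
    return {}


if __name__ == "__main__":
    for url in ("https://www.reddit.com/r/test/", "https://example.com/"):
        print(url, extra_headers_for(url))
```

Because `[^/]+` cannot cross a slash, the pattern only looks at the host part, so a reddit hostname appearing later in a path or query string does not trigger the header.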

ivan avatar Feb 01 '15 11:02 ivan

Implementation note: `pipeline.py` (15ae3ca6a6831f2b1ae366a58d5620474f5b3d2c) already adds this cookie for the top-level URL.

chfoo avatar Mar 18 '15 03:03 chfoo

Yeah, it would be better if this worked for any crawl that includes a reddit URL, not just those that start with a reddit URL.

ivan avatar Mar 18 '15 05:03 ivan

Similarly, send `Cookie: NCR=1` to all `*.blogspot.com` URLs.

ivan avatar May 16 '15 16:05 ivan

Basically, we should create a cookie jar for these and use wpull's `--load-cookies` option.
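A rough sketch of generating such a cookie jar with Python's standard library. It assumes `--load-cookies` reads the Netscape/Mozilla `cookies.txt` format, as wget's option of the same name does. The helper names, the far-future expiry, and the domain-wide scope (`.reddit.com`, `.blogspot.com`) are assumptions for illustration; whether the client accepts a domain-wide cookie for these hosts is not confirmed here.

```python
import http.cookiejar

# Assumed expiry (2038-01-19) so the cookies are not skipped as expired.
FAR_FUTURE = 2147483647


def make_cookie(domain, name, value):
    """Build a persistent, non-secure cookie valid for every path on `domain`."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=FAR_FUTURE, discard=False,
        comment=None, comment_url=None, rest={},
    )


def write_cookie_jar(path):
    """Write a cookies.txt file holding the cookies named in this thread."""
    jar = http.cookiejar.MozillaCookieJar(path)
    jar.set_cookie(make_cookie(".reddit.com", "over18", "1"))
    jar.set_cookie(make_cookie(".blogspot.com", "NCR", "1"))
    jar.save()


if __name__ == "__main__":
    write_cookie_jar("cookies.txt")
```

The resulting file could then be passed as `--load-cookies cookies.txt`. Setting `discard=False` and an explicit expiry matters because `MozillaCookieJar.save()` skips session cookies by default.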

JustAnotherArchivist avatar Dec 25 '18 00:12 JustAnotherArchivist

Also send the cookie `_options=%7B%22pref_quarantine_optin%22%3A%20true%7D` on Reddit to get around the quarantine blocks.
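For reference, the `_options` value above is percent-encoded JSON. A small sketch of how it decodes and how it could be rebuilt (the function name is hypothetical):

```python
import json
from urllib.parse import quote, unquote

ENCODED = "%7B%22pref_quarantine_optin%22%3A%20true%7D"


def quarantine_optin_value():
    """Rebuild the percent-encoded `_options` cookie value from plain JSON."""
    return quote(json.dumps({"pref_quarantine_optin": True}))


if __name__ == "__main__":
    print(unquote(ENCODED))          # {"pref_quarantine_optin": true}
    print(quarantine_optin_value())  # %7B%22pref_quarantine_optin%22%3A%20true%7D
```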

JustAnotherArchivist avatar Apr 28 '19 01:04 JustAnotherArchivist

Related: #416

JustAnotherArchivist avatar Jun 06 '20 23:06 JustAnotherArchivist

Some FC2 blogs are age-gated and require an `age_check=1` cookie. It should be sent to all `blog\d*\.fc2\.com` and `blog\d*\.fc2blog\.us` subdomains; the site sets it for the particular blog shard(?) domain when you click on the corresponding button on the age gate.
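A possible host check for the FC2 case, built from the two patterns above. The helper name is hypothetical, and matching the shard domain itself (for example `blog.fc2.com`) in addition to its subdomains is an assumption.

```python
import re
from urllib.parse import urlsplit

# Hosts that are, or end in, blog<N>.fc2.com or blog<N>.fc2blog.us.
FC2_HOST = re.compile(r"(?:^|\.)blog\d*\.(?:fc2\.com|fc2blog\.us)$")


def needs_age_check(url):
    """Return True if `url` is on an FC2 blog host that may be age-gated."""
    host = urlsplit(url).hostname or ""
    return FC2_HOST.search(host) is not None


if __name__ == "__main__":
    for url in ("http://example.blog123.fc2.com/", "http://example.fc2.com/"):
        print(url, needs_age_check(url))
```

Matching on the parsed hostname rather than the full URL avoids false positives from FC2 hostnames that only appear in a path or query string.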

JustAnotherArchivist avatar Sep 07 '20 03:09 JustAnotherArchivist