Add --cookies option to pass in a cookie file

Open brewingcode opened this issue 12 years ago • 17 comments

This is a re-pullrequest of #31, but against the most recent master (since I'm a bit more up to speed with git now)

Use a browser extension such as:

https://chrome.google.com/webstore/detail/cookietxt-export/lopabhfecdfhgogdbojmaicoicjekelh

...to dump cookies into a file, and then pass in the filename with "--cookies FILENAME". You can also copy the cookie file content, and then do "--cookies <(pbpaste)" to skip the intermediate file.

brewingcode avatar Aug 13 '13 02:08 brewingcode

I'd love to see this implemented without the need for temporary files.

RSully avatar Aug 13 '13 12:08 RSully

RSully, I agree, but I didn't see a way for cookielib to load a set of cookies via anything except a filename.

If there is some other builtin set of libraries besides cookielib + urllib2 that can build a cookie header for me, I'd be happy to switch. I'm not very familiar with the various modules Python comes with.
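
One possible way around the filename limitation is to skip cookielib and parse the Netscape cookie text directly from a string. The sketch below is only an illustration, not code from this pull request: the helper names `parse_netscape_cookies` and `cookie_header_for` are hypothetical, and the matching is deliberately simplified (it checks the domain only and ignores path, secure flag, and expiry).

```python
def parse_netscape_cookies(text):
    """Parse Netscape cookie-file text into a list of dicts (no file needed)."""
    cookies = []
    for line in text.splitlines():
        # Some exporters mark HttpOnly cookies with this prefix (assumption:
        # such lines should be treated as ordinary cookie lines).
        if line.startswith("#HttpOnly_"):
            line = line[len("#HttpOnly_"):]
        # Skip blank lines and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # skip malformed lines
        domain, _subdomains, path, secure, _expires, name, value = fields
        cookies.append({
            "domain": domain,
            "path": path,
            "secure": secure == "TRUE",
            "name": name,
            "value": value,
        })
    return cookies


def cookie_header_for(cookies, host):
    """Join cookies whose domain matches host into a Cookie header value.

    Simplification: only the domain is checked; path, secure flag, and
    expiry are ignored.
    """
    pairs = []
    for cookie in cookies:
        domain = cookie["domain"].lstrip(".")
        if host == domain or host.endswith("." + domain):
            pairs.append("%s=%s" % (cookie["name"], cookie["value"]))
    return "; ".join(pairs)
```

With something like this, the text from `pbpaste` or stdin could be turned into a header value without an intermediate file.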

brewingcode avatar Aug 13 '13 17:08 brewingcode

I don't like the cookie file as a user interface - ideally we'd accept cookie name/value pairs on the command line and use that to generate a cookie header manually (if urllib2 or cookielib can't do it then the format isn't that hard) then pass that to req.setValue_forHTTPHeaderField_().

Also, is req.setValue_forHTTPHeaderField_() enough? Will cookies sent that way also show up in the javascript document.cookie object? If not, is that a problem?
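
To illustrate how simple the header format is, here is a minimal sketch of building a Cookie header value from name/value pairs by hand. The function name `build_cookie_header` is hypothetical, and the sketch assumes the values need no extra quoting or encoding.

```python
def build_cookie_header(pairs):
    """Build a Cookie header value from 'NAME=VALUE' strings."""
    parts = []
    for pair in pairs:
        # Split on the first '=' only, so values may contain '=' themselves.
        name, sep, value = pair.partition("=")
        if not sep or not name.strip():
            raise ValueError("expected NAME=VALUE, got %r" % pair)
        parts.append("%s=%s" % (name.strip(), value.strip()))
    # Cookies in a request header are separated by "; ".
    return "; ".join(parts)
```

The resulting string is what would be handed to req.setValue_forHTTPHeaderField_() with "Cookie" as the header field name.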

paulhammond avatar Aug 14 '13 23:08 paulhammond

Fair enough. I've renamed the arguments to allow for both:

    --cookie-file=FILENAME
                        specify a Netscape cookie file
    --cookie=NAME=VALUE
                        specify a cookie name-value pair (multiple --cookie is
                        allowed)
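
For readers who want to see how a repeatable option like this can be declared, here is a rough sketch. It uses argparse purely for illustration; that is an assumption, and the actual script may well use a different option parser. The function name `make_parser` is hypothetical.

```python
import argparse


def make_parser():
    """Declare the two cookie options; --cookie may be given multiple times."""
    parser = argparse.ArgumentParser(prog="webkit2png")
    parser.add_argument(
        "--cookie-file",
        metavar="FILENAME",
        help="specify a Netscape cookie file",
    )
    parser.add_argument(
        "--cookie",
        metavar="NAME=VALUE",
        action="append",  # collect every occurrence into a list
        default=[],
        help="specify a cookie name-value pair (multiple --cookie is allowed)",
    )
    return parser


# Example: two --cookie flags end up as a two-element list.
args = make_parser().parse_args(["--cookie", "a=1", "--cookie", "b=2"])
```

Here `args.cookie` holds both pairs in order, and `args.cookie_file` stays `None` when the option is not given.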

I find the cookie file to be far more useful for me: I'm scraping my development WordPress blog (which requires user auth), and I do not want to mess around assembling on the command line all the cookies that are required to authenticate with WordPress. It's far faster to simply dump the cookies out of my browser and then feed that directly into cookielib. I was following the lead of curl, which allows a Netscape cookie file with the -b option.

I'm not sure about req.setValue_forHTTPHeaderField_() setting the cookies in such a way that javascript can see them...it has not mattered for my purposes.

brewingcode avatar Aug 15 '13 00:08 brewingcode

RSully, if you use --cookie, then no temporary file will be created, and none of the libraries I used to process the cookie file will even be loaded:

  • cookielib
  • urllib2
  • tempfile
  • os

brewingcode avatar Aug 15 '13 00:08 brewingcode

@aperlscript Thanks for the updated code - having both as an option works for me. I just left one question as a code comment...

Also, before merging this I'd like to understand how webkit2png interacts with Safari's cookie jar. Right now it appears to use Safari's cookies, which feels like a bad thing to me, but I haven't had time to investigate why.

paulhammond avatar Aug 15 '13 02:08 paulhammond

(Also, don't worry about the part where this pull request will no longer automatically merge, I'll deal with that when it's ready)

paulhammond avatar Aug 15 '13 02:08 paulhammond

I hadn't realized that this already used Safari's cookiejar - I didn't notice this during my testing. Right now to get around any auth issues I have been saving pages as webarchives and running webkit2png against that.

RSully avatar Aug 15 '13 12:08 RSully

I don't think this touches Safari's cookies: if I pass in cookies for auth for webkit2png, and then go into Safari's cookies, I don't see any entries for the domain that used my auth cookies.

brewingcode avatar Aug 15 '13 18:08 brewingcode

any plans to pull this in?

jgallen23 avatar Oct 31 '13 19:10 jgallen23

Merged in the latest paulhammond/master, which necessitated a couple more minor changes.

brewingcode avatar Oct 31 '13 19:10 brewingcode

Similarly, I'd love to have a means to tell it not to use Safari's cookie jar, for example a --no-cookies option.

Saeven avatar Jan 28 '14 19:01 Saeven

I'm embarrassed to admit how long it took me to realize that NSURLRequest uses Safari's cookies by default, independent of the code that this pull request is adding. Consider me an idiot.

I've added an option to suppress this default behavior in the request object by simply setting Cookie to an empty string. I didn't see a way to remove the header altogether.

brewingcode avatar Jan 28 '14 19:01 brewingcode

webkit2png doesn't use Safari's cookies for me. Maybe it has stopped working at some point?

raine avatar Jan 13 '16 14:01 raine

I tried --cookie-file with the output from the Chrome extension, and it sent an empty string as the cookie string.

--cookie=FOO=BAR works though.

raine avatar Jan 13 '16 15:01 raine

@raine see issue #94. Previously webkit2png used Safari's cookies. I'm not sure what changed, perhaps Safari's sandboxing.

RSully avatar Jan 13 '16 18:01 RSully

For anyone interested, my fork raine/master includes this change, and I added a --cookie-raw option to allow setting the raw value of the Cookie header. I added it because it's easier to use with Chrome's "Copy as cURL" feature.
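
As a rough illustration of that workflow, the sketch below pulls the raw Cookie value out of a copied curl command. It is not part of the fork: the helper name `cookie_from_curl` is hypothetical, and it assumes the command carries the cookies either in a `-H 'cookie: ...'` header or in curl's `-b`/`--cookie` option.

```python
import shlex


def cookie_from_curl(command):
    """Return the raw Cookie header value found in a curl command, or None."""
    # shlex does not treat backslash-newline as a line continuation,
    # so join continued lines before splitting.
    tokens = shlex.split(command.replace("\\\n", " "))
    # Walk over (flag, value) pairs of adjacent tokens.
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-H", "--header"):
            name, _, rest = value.partition(":")
            if name.strip().lower() == "cookie":
                return rest.strip()
        elif flag in ("-b", "--cookie"):
            return value
    return None
```

The returned string is the kind of value one would then pass to --cookie-raw.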

raine avatar Jan 14 '16 09:01 raine