cli icon indicating copy to clipboard operation
cli copied to clipboard

Cookies not correctly updated within a --follow redirect chain

Open northys opened this issue 6 years ago ā€¢ 17 comments

Hello,

when site creates new session for user, the PHPSESSID is not overrided but both the old and the new one is sent. Please notice Cookie header in last request. I'm using httpie version 0.9.8.

Ā» http --verbose --session=/tmp/yamaha.json --form --follow POST https://www.yamaha-extranet.com/login/index [email protected] password=bar submitform=Submit
POST /login/index HTTP/1.1
Cookie: PHPSESSID=mdslhb7u0giujsaf8itq2gm2p0
Host: www.yamaha-extranet.com
User-Agent: HTTPie/0.9.8

email=****&submitform=Submit

HTTP/1.1 302 Found
Location: /
Set-Cookie: PHPSESSID=nb7qnhkfjsdtpe8gtj797koaq7; path=/; domain=www.yamaha-extranet.com; secure
X-Powered-By: PHP/5.5.38



GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=mdslhb7u0giujsaf8itq2gm2p0; PHPSESSID=nb7qnhkfjsdtpe8gtj797koaq7
Host: www.yamaha-extranet.com
User-Agent: HTTPie/0.9.8

Session file contains only one PHPSESSID though.

{
    "__meta__": {
        "about": "HTTPie session file",
        "help": "https://httpie.org/docs#sessions",
        "httpie": "0.9.8"
    },
    "auth": {
        "password": null,
        "type": null,
        "username": null
    },
    "cookies": {
        "PHPSESSID": {
            "expires": null,
            "path": "/",
            "secure": true,
            "value": "nb7qnhkfjsdtpe8gtj797koaq7"
        }
    },
    "headers": {}
}

northys avatar Mar 14 '18 14:03 northys

Is this issue up for grab? If so, I would like to give this an attempt

asifmallik avatar Jun 09 '20 01:06 asifmallik

@asifmallik yes, it is!

Here you can learn all about how sessions in HTTPie work:

  • https://httpie.org/docs#sessions
  • https://github.com/jakubroztocil/httpie/blob/master/httpie/sessions.py
  • https://github.com/jakubroztocil/httpie/blob/master/tests/test_sessions.py

Iā€˜d definitely start by writing a few test cases reproducing the issue.

My hunch is that the cookie assignment and saving of the session needs to happen after each individual request in the chain inside the loop; not only at the end this like we currently do:

https://github.com/jakubroztocil/httpie/blob/7ee519ef46ed0a4280f431d0c14745ed8244621f/httpie/client.py#L110-L113

jkbrzt avatar Jun 09 '20 04:06 jkbrzt

Thank you so much! I have looked into the test suite and there's a quick question I had about it - from what I understand, for testing, httpie uses httpbin. I imagine to write a test for this issue in particular, there would need to be some way of combining /redirect-to with /cookies/set. From what I read so far from httpbin.org, this does not seem possible. Do you know whether there is any workaround?

asifmallik avatar Jun 09 '20 05:06 asifmallik

@asifmallik luckily, /cookies/set also redirects. So this is how you can reproduce it:

1. prepare a session with a cookie

$ cat test-session.json
{
    "cookies": {
        "FOO": {
            "value": "BAR"
        }
    }
}

2. call /cookies/set

$ http --follow --all --print=H --session=./test-session.json httpbin.org/cookies/set?FOO=BAZ
GET /cookies/set?FOO=BAZ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR
Host: httpbin.org
User-Agent: HTTPie/2.1.0

GET /cookies HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR; FOO=BAZ
Host: httpbin.org
User-Agent: HTTPie/2.1.0

It should also handle and have tests for --session-readonly (cookies should be assigned after each request, but the session file shoud not be updated).

jkbrzt avatar Jun 09 '20 05:06 jkbrzt

Related: #824

jkbrzt avatar Jun 09 '20 13:06 jkbrzt

@jakubroztocil After a bit of tinkering, I think I have identified where the bug happens. It seems that httpie correctly replaces the cookie but only when outputting the headers for the redirected page, both the cookies are shown for some reason. To illustrate this, we can run:

  1. http --follow --all --session=./test-session.json httpbin.org/cookies/set?FOO=BAR --print=Hb which outputs:
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="/cookies">/cookies</a>.  If not click the link.

GET /cookies HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

{
    "cookies": {
        "FOO": "BAR"
    }
} 

  1. http --follow --all --session=./test-session.json httpbin.org/cookies/set?FOO=BAZ --print=Hb which outputs:
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="/cookies">/cookies</a>.  If not click the link.

GET /cookies HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR; FOO=BAZ
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

{
    "cookies": {
        "FOO": "BAZ"
    }
}

  1. http --follow --all --session=./test-session.json httpbin.org/get --print=Hb which outputs:
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAZ
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

{
    "args": {},
    "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Cookie": "FOO=BAZ",
        "Host": "httpbin.org",
        "User-Agent": "HTTPie/2.2.0-dev",
        "X-Amzn-Trace-Id": "XXX"
    },
    "origin": "XXX",
    "url": "http://httpbin.org/get"
} 

Clearly, there is a discrepancy between what is output in 2 as a header response and what is actually saved as a header response which is reflected in 3. I have looked into the StrCLIResponse class which does not really seem to parse the header outputs from httpie. My understanding is that up until this point, there hasn't been any discrepancy between what is saved (hence reflected by a query to /cookies/get and what is actually output as header) so there was no need to parse this.

This also explains why the bug is not caught by the test test_session_update which does the above but it only checks the cookies through httpbin.org/get. So, I think the correct way to test this would be to:

  1. Extend StrCLIResponse to parse the header output from the CLI response
  2. For every header related test which involves querying httpbin.org/get to check the header should be complemented by adding a new assertion to also check the header output in the CLI wherever appropriate to expose possible discrepancy elsewhere.

So after this modification, this bug would be essentially caught by a modified test_session_update.

Am I understanding this issue correctly? Do you think this is a reasonable way to approach this or is there an interface already to access the header output from the CLI response?

asifmallik avatar Jun 11 '20 13:06 asifmallik

Actually it might make more sense to add a header property to BaseCLIResponse instead of string one as this is not specific to string responses

asifmallik avatar Jun 11 '20 13:06 asifmallik

@asifmallik it might be a bug in requests after all:

import requests


response = requests.get('http://httpbin.org/cookies/set?A=BBB', cookies={'A': 'AAA'})

print(response.json())  
#  {'cookies': {'A': 'BBB'}}  

print(response.request.headers['Cookie'])  
# A=AAA; A=BBB  

jkbrzt avatar Jun 11 '20 15:06 jkbrzt

@jakubroztocil I see, would you like to open an issue on requests or, should I?

asifmallik avatar Jun 11 '20 15:06 asifmallik

@asifmallik feel free to open one, but it would be good to poke around in requestsā€™s source code first for a bit to make sure we are not missing something.

jkbrzt avatar Jun 11 '20 15:06 jkbrzt

@jakubroztocil I see thank you!

asifmallik avatar Jun 11 '20 15:06 asifmallik

I've put in a PR to fix just the problem with cookie domain not being stored in session file - hoping to publish an authentication plugin for a specific server type of authentication which needs auth stored in cookies to work acrosss sessions, so hoping to get this PR accepted. Without the fix I get endless auth loops when resuming a session where the authentication is expired.

barny avatar Feb 28 '22 08:02 barny

Thanks for the PR @barny! Unfortunately we can't accept it as of this moment, since this issue has multiple points that we'd like to fix (most importantly change how cookies are represented in the session files) to take our sessions to a next level. We are actively working on it, and hopefully it will be part of the upcomiing release.

isidentical avatar Feb 28 '22 08:02 isidentical

@isidentical Is there any chance that can merge in the meantime while a larger overhaul is being performed? I just realized all my issues with using httpie for an auth-flow that I'm working on are stymied by this issue with domains. It took a bit of digging, but when I plugged @barny's solution into the sessions.py file, it worked like magic.

... I guess I can just run with this local custom patch, but I also prefer not messing with my libraries locally šŸ˜›

adamtaylor13 avatar Mar 08 '22 23:03 adamtaylor13

Hey @adamtaylor13! We've just released 3.1.0 with the new sessions layout, please give it a try (we still haven't made the official annonucement yet, but the package is already on PyPI/Snap/Brew) if you can!

isidentical avatar Mar 09 '22 10:03 isidentical

@isidentical Oh... Fantastic! Yeah, I just upgraded to 3.1.0 and that seemed to do the trick! Big thanks! šŸŽ‰

adamtaylor13 avatar Mar 09 '22 13:03 adamtaylor13

Glad you liked it! @adamtaylor13

isidentical avatar Mar 09 '22 14:03 isidentical