pyFF icon indicating copy to clipboard operation
pyFF copied to clipboard

store.py match("^(.+)=(.+)$", key) lookup breaks entityID's with base64 encoded, padded content

Open mrvanes opened this issue 7 years ago • 12 comments

In store.py @270 there is logic to recognise "=" separated keys. We have entityID's that embed base64 encoded information in entityID, one of them ending in the base64 padding "==". This causes _lookup to miss the entityID, causing the discovery page to break with a 404 in mdx.py @588 for this SP.

The following regex solves the problem, but I'm unsure if it would break the intended "=" situation:

m = re.match("^(.+)=([^=].+)$", key)

And I'm also unsure if it solves the single padded base64 encoded strings, let alone entityID's where the padding is in the middle of the entityID. That would certainly break anytime.

mrvanes avatar Jun 13 '17 12:06 mrvanes

Can you provide a sample URL that breaks?

leifj avatar Jun 13 '17 13:06 leifj

Also feel free to provide a PR for quicker review but that URL would make a good testcase...

leifj avatar Jun 13 '17 13:06 leifj

Sample entityID URL (obfuscated the domain)

https://satosa.***.nl/Saml2MirrorSP/proxy_saml2_backend.xml/Saml2IDP/aHR0cHM6Ly9zcC4qKioqLmNvbS9zYW1sL21vZHVsZS5waHAvc2FtbC9zcC9tZXRhZGF0YS5waHAvZGVmYXVsdC1zcA==

I don't want to create a pullrequest because I'm far from certain that my (quick) fix doesn't break any other intended behaviour. I'd rather you scrutinize the problem and rethink the "=" case to lookup { } logic. In my case the "... adding %s=%s" debug code resulted in an empty v, which is a telltale sign that something went wrong with the earlier rewrite of the "=" case.

mrvanes avatar Jun 13 '17 14:06 mrvanes

yeah this smells URL encoding fail to me somewhere - normally that == would/should have been URL-encoded before passing to pyFF because otherwize you could never do query-string parsing

leifj avatar Jun 13 '17 19:06 leifj

Yes, normaly it's URL encoded, but I think (have to check) that this will be decoded inside the server, so still be a problem?

mrvanes avatar Jun 14 '17 09:06 mrvanes

Here's proof:

[14/Jun/2017:09:52:06]  memory store lookup: https://satosa.****.nl/Saml2MirrorSP/proxy_saml2_backend.xml/Saml2IDP/aHR0cHM6Ly9zcC5tcnZhbmVzLmNvbS9zYW1sL21vZHVsZS5waHAvc2FtbC9zcC9tZXRhZGF0YS5waHAvZGVmYXVsdC1zcA==
[14/Jun/2017:09:52:06]  lookup: https://satosa.*****.nl/Saml2MirrorSP/proxy_saml2_backend.xml/Saml2IDP/aHR0cHM6Ly9zcC5tcnZhbmVzLmNvbS9zYW1sL21vZHVsZS5waHAvc2FtbC9zcC9tZXRhZGF0YS5waHAvZGVmYXVsdC1zcA==
[14/Jun/2017:09:52:06]  trying null index lookup https://satosa.****.nl/Saml2MirrorSP/proxy_saml2_backend.xml/Saml2IDP/aHR0cHM6Ly9zcC5tcnZhbmVzLmNvbS9zYW1sL21vZHVsZS5waHAvc2FtbC9zcC9tZXRhZGF0YS5waHAvZGVmYXVsdC1zcA==
83.85.149.243 - - [14/Jun/2017:09:52:06] "GET /role/idp.ds?entityID=https%3A%2F%2Fsatosa.****.nl%2FSaml2MirrorSP%2Fproxy_saml2_backend.xml%2FSaml2IDP%2FaHR0cHM6Ly9zcC5tcnZhbmVzLmNvbS9zYW1sL21vZHVsZS5waHAvc2FtbC9zcC9tZXRhZGF0YS5waHAvZGVmYXVsdC1zcA%3D%3D&return=https%3A%2F%2Fsatosa.****.nl%2FSaml2MirrorSP%2Fdisco HTTP/1.1" 200 6438 "" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36"

Original request was URL encoded, but lookup debugs the decoded entityID.

mrvanes avatar Jun 14 '17 09:06 mrvanes

Yeah you're right. I need to think about the right way to deal with this. Your patch at least solves one part of this so I don't mind applying it but this actually needs thought...

leifj avatar Jun 14 '17 11:06 leifj

From time to time we're trying to see if running unmodified vanilla pyff would be a viable option, but bugs like this not being patched upstream make it hard. Any progress on the "thoughts" about this? Do you want a PR for the suggested insertion of the four characters?

mrvanes avatar Aug 26 '19 12:08 mrvanes

Is it still there in the wsgi api?

Skickat från min iPhone

26 aug. 2019 kl. 14:38 skrev Martin [email protected]:

From time to time we're trying to see if running unmodified vanilla pyff would be a viable option, but bugs like this not being patched upstream make it hard. Any progress on the "thoughts" about this? Do you want a PR for the suggested insertion of the four characters?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

leifj avatar Aug 26 '19 19:08 leifj

if it is still in HEAD I'd appreciate a PR!

leifj avatar Aug 26 '19 20:08 leifj

ping @mrvanes what should we do about this?

leifj avatar Mar 30 '21 22:03 leifj

  1. We don't use base64encode entityID's anymore, so we don't experience the problem anymore, but
  2. The bug is real: any valid entityID containing = characters will trigger it

Will that happen often? I doubt it.

mrvanes avatar Apr 01 '21 11:04 mrvanes