jwt icon indicating copy to clipboard operation
jwt copied to clipboard

Unencrypted tokens and parser

Open pkly opened this issue 3 years ago • 17 comments

Hi, first off, love the lib. It's even better in 4.x than it was in 3.x.

One thing bothers me a bit though, since the interfaces moved around and claims() got moved to the UnencryptedToken interface it left a weird gap in the Parser interface, since it's not possible to get a "proper" decoded token interface via the default implementation. Yeah, it's returning an unencrypted plain token, but it's a bit weird that you can't simply force that.

I would like to propose a solution to this problem, just like there's a Token interface and UnencryptedToken interface, maybe let's have Parser and UencryptedParser interface? The default implementation could stay the same, the UnencryptedParser interface would simply add a new method - parseUnencrypted(string $jwt): UnencryptedToken or something like that.

There'd be no modification required on the Parser class, simply changing the interface and having both methods return the same object.

pkly avatar Feb 22 '22 10:02 pkly

What I'm kinda missing is what we're trying to achieve: what is the use-case, at high level/architecture?

Ocramius avatar Feb 22 '22 10:02 Ocramius

We have our own oauth2 server which uses the jwt tokens for access. Sometimes we need to decode it to check for a few things, including ids. This is only possible by using claims, which are not available on the Token interface, but were easier to access in the 3.x library.

pkly avatar Feb 22 '22 11:02 pkly

Ah, so you mean that there should be a parser that guarantees that a non-encrypted token is returned.

Perhaps it would be a good idea to have an UnencryptToken#__invoke(Token|UnencryptedToken|EncryptedToken $token): UnencryptedToken, that you can funnel tokens through, and that always guarantees a cleartext token to be usable?

Ocramius avatar Feb 22 '22 13:02 Ocramius

I don't think it can always be assumed that a theoretically encrypted token could be decrypted without the use of a service with some configuration

pkly avatar Feb 22 '22 13:02 pkly

That would be a service :)

Ocramius avatar Feb 22 '22 13:02 Ocramius

Oh, sorry, I misread. Yeah I suppose that would also work.

pkly avatar Feb 22 '22 13:02 pkly

@lcobucci WDYT? That would solve/remove the problem of casting a Token into an UnencryptedToken, which currently needs to be done via assert(...), in order to be type-safe, such as in:

https://github.com/lcobucci/jwt/blob/5f1fbfd107c0b877b151f4d73f3353805ab8312a/src/JwtFacade.php#L37-L45

Ocramius avatar Feb 22 '22 13:02 Ocramius

There are some oddities around, IMHO. I'm just dealing with creating tokens, so after reading the docs I wondered: How can they suggest calling claims() without asserting what is returned from the builder? And looking at the Lcobucci\JWT\Builder interface, I was even more puzzled: public function getToken(Signer $signer, Key $key): Plain; - Plain is the implementation of UnencryptedToken, but why is an interface responding with an implementation, and not with an interface for that implementation?

So the only available builder is explicitly forced to return a Plain implementation, and the only available parser is forced to return the most generic Token interface, but internally also only returns Plain for now. I understand that maybe this is reserving room for encrypted tokens somehow, or any other type of token that are not exactly UnencryptedToken, but what can be in between the two? UnencryptedToken::claims() is returning any claims a token may have, while Token has plenty of specific validator methods that also simply access that claims dataset object in the only implementation - and technically they are simply put into the same location in the JSON string.

So there is this unexplained gap, and having a dedicated method that explicitly returns UnencryptedToken from the parser gets +1 from me.

SvenRtbg avatar Feb 22 '22 15:02 SvenRtbg

Yes, that seems rather weird as well. I'd suggest just returning UnencryptedToken in both instances.

pkly avatar Feb 22 '22 17:02 pkly

from the parser

It would be a separate object :)

Ocramius avatar Feb 22 '22 18:02 Ocramius

Folks, thanks for sharing your points! It's the feedback I wanted to receive during the beta/rc releases of v4.0.

And looking at the Lcobucci\JWT\Builder interface, I was even more puzzled: public function getToken(Signer $signer, Key $key): Plain; - Plain is the implementation of UnencryptedToken, but why is an interface responding with an implementation, and not with an interface for that implementation?

The new interfaces were created in v4.0 and the token was designed to be just a value object - which is generally okay for an interface to depend on. UnencryptedToken was only introduced in v4.1 - and we didn't want to change the signature of the interface (even though it's not technically a BC-break since types are still compatible).

One thing bothers me a bit though, since the interfaces moved around and claims() got moved to the UnencryptedToken interface it left a weird gap in the Parser interface, since it's not possible to get a "proper" decoded token interface via the default implementation.

This is similar to the discussion in #814.

I didn't have the time to finish that yet (which is why #814 is also lingering) but I'm documenting the JwtFacade and thinking a lot about the gap you're describing.

To me, the JwtFacade is exactly what we should use to solve it - and we get the benefits of a safer usage of tokens. That API already provides the necessary guards to always return unencrypted tokens, so I wonder why do we need to introduce yet another one.

The effort of implementing that kind of parser is really minimal:

use Lcobucci\JWT\Encoding\JoseEncoder;
use Lcobucci\JWT\Token\Parser;
use Lcobucci\JWT\UnencryptedToken;

final class UnencryptedTokenParser implements \Lcobucci\JWT\Parser
{
    private function __construct(private readonly \Lcobucci\JWT\Parser $parser) {}

    public static function create(): self
    {
        return new self(new Parser(new JoseEncoder()));
    }

    public function parse(string $jwt): UnencryptedToken
    {
        $token = $this->parser->parse($jwt);

        if (! $token instanceof UnencryptedToken) {
            throw InvalidTokenStructure::tokenIsEncrypted();
        }

        return $token;
    }
}

My question is: is this really the direction we want to go?

I kept this forward compatibility layer with the hope of supporting encryption without having to break BC (#6 is open for years and years). However, what I'm getting from users is: we don't really need encryption.

If that's the case, we should aim at simplifying things (even if that would require a v5 soon - which is completely fine btw) instead of introducing more and more components.

lcobucci avatar Feb 22 '22 21:02 lcobucci

I'm not suggesting a new Parser implementation. I am suggesting a new interface:

interface DecryptToken {
    public function __invoke(Token $token): UnencryptedToken;
}

This is kinda important due to parsing and decrypting being different planets 😁

Ocramius avatar Feb 22 '22 22:02 Ocramius

If in 3 everything including claims was contained within the token itself I don't really see why the split was necessary between Token and UnencryptedToken (if there was one? unsure, as I didn't really delve that deep into it before converting our code to v4).

I don't really mind either solution, either merging interfaces or having a separate one. If we go the DecryptToken interface route we could probably keep compatibility in 4 without worrying much about BC breaks, the default implementation on Parser could just check if the token is actually unencrypted, else throw a DecryptionFailedException or something, since obviously support at the moment is not necessary. If someone would need a decryption layer for their tokens they could simply implement the interface and use their class.

On that note, I wonder if using __invoke is the best idea, but that's just a minor issue.

@lcobucci I don't really like the idea of changing the return type on an implementation itself since if you'd use something like symfony you could create a service and simply bind the interface as a service - you'd still get the same issue we're having right now, since the implementation itself is not important, the interface is.

pkly avatar Feb 23 '22 07:02 pkly

I have one thought related to the possible future situation with this library.

You may remember when basically all JWT libraries had a serious security flaw because they unconditionally accepted "none" as the signature method announced in the header, so an attacker might just select that schema, omit the third part with the signature, and has full control over the data part and any claims.

So now imagine that every user of this library is accustomed to "the parser returns UnencryptedToken anyways", and suddenly this assumption is not true anymore because somehow an EncryptedToken is returned. Technically attackers would be able to pass about any string into the parser. Or maybe the other way around, code that might be expecting an encrypted token is now getting an unencrypted variant.

The key learning from that security incident for me is: Ask for things very specifically, state what your expectations are in the code to allow detecting deviations early. This means that even if both types of token would be inheriting from the same parent interface, I'd still expect the parser to be asked to work on the two different things by calling two distinct methods. Maybe even have different parser implementations. What I'd avoid is to have one generic "parse it all" method that then has to use third party information, guesses or black magic about what type of token may have been passed.

So if at all encrypted tokens would be implemented, they should be treated differently from the start, so any thoughts about maybe make them seem to just be Token might not be helpful for the use case anyway.

SvenRtbg avatar Feb 23 '22 19:02 SvenRtbg

My 2 cents. UnencryptedToken, EncryptedToken... these terms are misused because obviously no encryption is done here. Signed tokens are just encoded into base64 url safe but not encrypted. It should simply be Token/SignedToken, ParsedToken or similar, but encrypted should not used at all unless JWE is supported

Oxmoze avatar Feb 23 '22 20:02 Oxmoze

The point is differentiating JWE from the other tokens: no confusion between signing and encrypting.

Ocramius avatar Feb 23 '22 20:02 Ocramius

The library currently has (string $token): Token) and (string $probablyUnencryptedToken, SignedWith $signedWith, ValidAt $validAt, Constraint ...$constraints): UnencryptedToken. It is clear to me that there is a gap between those, and I think there is a use case for a simple function with the signature (string $probablyUnencryptedToken): UnencryptedToken.

No custom implementation, parse + decrypt, dependency injection, dummy implementation of validation rules, or any other complication should be necessary to do this. Higher level APIs that are either more flexible, more opinionated or different in any other way can coexist with this.

Ask for things very specifically, state what your expectations are in the code to allow detecting deviations early.

I agree with @SvenRtbg's argument, adding a specific parser would achieve that.

spawnia avatar Feb 27 '22 09:02 spawnia