
CLI: Persist URL of instance together with session information

Open mart-w opened this issue 1 year ago • 17 comments

Is your feature request related to a problem? Please describe

I have accounts on multiple Kanidm instances, and have logged into sessions with all of them through the Kanidm CLI. Whenever I call a CLI command and omit the -D flag, I get an interactive prompt which asks me which of my active sessions I would like to use to run this command, which is my preferred flow rather than having to append my SPN to every command. However, I still can’t get around passing the -H flag to specify the instance if the session I’m trying to use isn’t related to the instance I have specified in my Kanidm config. This is especially cumbersome as I find URLs harder to remember and slower to type than user names or even SPNs.

Describe the solution you'd like

I would love if, together with the SPN and session token, the Kanidm CLI would also persist the URL of the corresponding Kanidm instance for each active session. This way, the -H flag would only have to be passed once during login (or not at all if the instance’s URL is set as default in the config file). After logging in, choosing the correct session would only require either interactive selection or passing the SPN using the -D flag, improving the user experience.

Describe alternatives you've considered

The option to specify alternative environments in the config file alleviates the issue somewhat, as instead of having to type a URL, you can choose an arbitrary identifier which can be easier to type. Still, I think that caching the URL as soon as a session is established would be the most user-friendly solution.

mart-w avatar Jan 08 '25 23:01 mart-w

We actually have profiles for this

https://kanidm.github.io/kanidm/stable/client_tools.html#multiple-instances

This way you can just export an env variable at the start of a session to define which instance you want to use, so you don't have to type -H a whole bunch :)

Hope that helps,

Firstyear avatar Jan 09 '25 00:01 Firstyear

True, that does actually make it a bit more convenient than using the --instance flag, thank you! However, it still requires a conscious effort to set this in your terminal session, and it doesn’t persist across terminal sessions. It also requires configuration for each instance you want to use this for.

In my mind, I could log in to a number of accounts across Kanidm instances, and anytime I run a command where I don’t explicitly specify which account to run that with, it lets me choose the right account and seamlessly infers the proper host for that account from the stored session. I feel like we’re already 90 % of the way there with the interactive session prompt, and that would take the last edge off of working with multiple accounts across instances. In fact, this seems so intuitive to me that I assumed that it was already working this way, causing me quite a bit of confusion when I interactively selected a Kanidm session from another instance and got nondescript authentication errors.

The only situation where this could lead to issues is when the same instance is only reachable through different URLs that are specific to certain network environments, making it necessary to choose a different URL depending on circumstances. But I feel like that would be more the exception than the rule, and in those cases, falling back to the current mechanisms would probably be fine.

mart-w avatar Jan 09 '25 03:01 mart-w

Actually, without knowing the codebase myself, this sounds like something that wouldn’t be hard to implement. I don’t have a ton of Rust experience, but do you think this could be a good first issue for me to work on?

mart-w avatar Jan 09 '25 03:01 mart-w

I'm trying to think about how this might affect the user experience. The thing is that per instance we aren't just storing the url but the jwk that signs the tokens too. Currently we don't have a way to revoke a jwk or get it automatically, but that's something we should improve and have.

So were we to have "one instance, many servers", we're now crossing some streams where the JWK that signs one instance's tokens now affects another instance.

I think the need to configure the instances is not really a bad thing, since you know all your URLs ahead of time, so what stops you adding them all?

Anyway, really what you're kind of suggesting here would turn a lot of our current session handling upside down - rather than an instance that has users, we have users that relate to an instance. So there would be one giant token store for all users, rather than a token store per instance.

And perhaps that's the better experience, because then you would see a list like "[email protected]" and "[email protected]" etc. Where it would get a bit tricky is how we configure URLs in the client config. We'd effectively need a default instance like we have now, but then a way to map an SPN domain to the instance URL in the config. Or a way to say "on first use": when you use [email protected], if you have -H then we can store that data. But that also means we need a way to edit that data and clear it out too, if the user gets the URL wrong or something changes and they want to associate new.domain with a different URL.
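The "on first use" mapping from an SPN's domain part to an instance URL could be pictured as a small lookup table. This is a hypothetical sketch, not the actual token store format; the type and method names (`DomainUrlMap`, `record`, `forget`) are invented for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical client-side store: maps the domain part of an SPN
/// (everything after the '@') to the base URL it was first used with.
#[derive(Default)]
struct DomainUrlMap {
    urls: HashMap<String, String>,
}

impl DomainUrlMap {
    /// Record the URL on first use (e.g. when the user passes -H at login).
    /// An existing entry is kept, so later logins don't silently rebind it.
    fn record(&mut self, spn: &str, url: &str) {
        if let Some((_, domain)) = spn.split_once('@') {
            self.urls
                .entry(domain.to_string())
                .or_insert_with(|| url.to_string());
        }
    }

    /// Later commands can infer the URL from the SPN alone.
    fn url_for(&self, spn: &str) -> Option<&str> {
        let domain = spn.split_once('@')?.1;
        self.urls.get(domain).map(String::as_str)
    }

    /// Clearing an entry covers the "user got the URL wrong" case.
    fn forget(&mut self, domain: &str) {
        self.urls.remove(domain);
    }
}
```

The `forget` method is the "edit that data and clear it out" half of the idea: without it, a mistyped URL would stick to the domain forever.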

Where this scheme of "instances" like we have now works really well is that you can have different instances for the "load balancer" and "each server" of a domain, but each of them is a separate instance with separate tokens for debugging purposes.

I'd say if you want to have a go at this, we need to think about a design doc, the user experiences, and how to handle those. Then once we have that in mind, we can consider if we want to make this change or not.

Not saying your idea is bad btw - it's just different, and that has pros and cons, and we need to weigh those to decide what's going to be best for users.

Reference - here is the token and client setup code.

https://github.com/kanidm/kanidm/blob/master/tools/cli/src/cli/session.rs#L31

https://github.com/kanidm/kanidm/blob/master/tools/cli/src/cli/common.rs#L100

Firstyear avatar Jan 09 '25 03:01 Firstyear

Okay so this response got a lot longer than I anticipated. Here, have some headings!

My Experience

Interesting! Since I so far haven’t used the instance configuration feature much and instead relied on -H most of the time, I didn’t realise that instances are not just a mere alias for a set of config options, but in fact provide different scopes for the session store!

When I configure multiple instances and switch between them through either the environment variable or the --instance flag, the list of sessions I get either in the interactive selector or from kanidm session list depends on the instance that is currently selected – I’m only presented with the sessions I opened in the context of the currently selected instance. When I still exclusively relied on -H, however, everything happened in the context of the default instance, meaning that the session list did in fact contain sessions from different domains. In my particular case, I would be presented with the choice between [email protected] and [email protected], even though only one of them would actually be usable without -H because the other one was tied to a different host with a different URL.

Now that I see the data structure behind it, I can see how it’s supposed to work – and where I went down the wrong path. The whole reason I even connected to that other instance through the CLI wasn’t that I’m an administrator wanting to work on a different domain or access a particular host in a replicated and load-balanced Kanidm deployment. I went there as a user who wanted to change their displayname. As such, I didn’t go out and look whether there is documentation on configuring multiple instances – it didn’t even cross my mind to change anything about my configuration. All I did was run kanidm help login, see that the flag for using a different URL is -H, use that flag, and log in. The confusion started when I continued to run kanidm person update --displayname ..., interactively picked the right account from the two accounts in two different domains that were presented to me, and it didn’t work without another -H. Up to that point, everything seemed to work exactly how I intuitively assumed it would, and I didn’t realise that my mental model didn’t match the implementation and that I should’ve configured a separate instance all along.

In my case, I feel like the change I suggested would have greatly improved my user experience – it would have reduced the number of situations where I had to explicitly specify the host using the -H flag to one, and it would have avoided the confusion when Kanidm interactively offered me a session that, when selected, would simply lead to an error.

About JWK Storage

The JWK store didn’t get in my way at any point, by the way – inspecting the kanidm_tokens file shows that the default instance is now simply associated with the keys of both instances. It might have security implications if we assume that each instance from the client perspective maps to exactly one Kanidm deployment out there, as a server might suddenly be able to sign messages on behalf of another if keys from different servers end up mixed up, but you’re probably much better at gauging that than I am. In any case, this issue isn’t directly linked to the feature I proposed, as you can already easily run into it now.

Proposal

Now that I know how instances are supposed to work, I do see the value in them. I also think that we wouldn’t need to scrap that idea and that we can have the best of both worlds. The way I understand it, instances currently work on the client side more like scopes rather than representations of individual server-side instances / deployments / servers / whatever you might call them.

What if we embraced that thought and actually called them scopes instead of instances, with each scope having its own set of configuration options and its own list of sessions, and with each session being associated with a server URL? Each scope could still have a configured "default URL" – like instances do now – to make interacting with one’s "main instance" more convenient. The default scope could then be the place where all your everyday sessions live: for your home lab, your university, your workplace, maybe your friend’s cloud, and switching between those would be as easy as using the -D flag, the appropriate environment variable, or the interactive selector. Additional scopes could then be configured to represent special cases and environments, like the example you gave where you want to bypass a load balancer to work on a replicated host directly. This would give advanced users and administrators the option to set up additional scopes as they need them, but keep that complexity out of the way of everyday users.

Compatibility with Current Use Cases

I think that this could be achieved while preserving the vast majority of current use cases. Currently, URLs provided through flags or the environment take precedence over the instance configuration. If we added session-based URL inference in the middle – so that flags and the environment take precedence over the session data, and the session data takes precedence over the scope configuration (formerly instance configuration) – the only scenarios that would break would be the ones where users change their instance URL through the configuration file. The only two scenarios I can come up with where this would happen are server migrations and situations where people keep editing the config file in order to connect to different servers. Maybe you are aware of more scenarios.
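The precedence order just described (flag/environment over cached session URL over scope config) amounts to a simple resolution chain. A minimal sketch, with invented names; this is not the actual client code:

```rust
/// Hypothetical inputs available when resolving which URL to contact.
struct UrlSources<'a> {
    cli_flag: Option<&'a str>,      // -H on the command line
    env_var: Option<&'a str>,       // a KANIDM_URL-style variable
    session_url: Option<&'a str>,   // URL cached with the selected session
    scope_default: Option<&'a str>, // default URL from the scope config
}

/// Flags and environment win, then the session's cached URL,
/// then the scope's configured default.
fn resolve_url<'a>(s: &UrlSources<'a>) -> Option<&'a str> {
    s.cli_flag.or(s.env_var).or(s.session_url).or(s.scope_default)
}
```

The only behavioural change against today is the third step: a session's cached URL shadows the config file's default, which is exactly the case that breaks when someone edits the config to repoint an existing session.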

Switching Environments Through the Configuration File

The latter scenario would probably be easily covered: users who do that are probably advanced users who would be able to adapt their use cases to using either flags, environment variables, or multiple scope configurations. In interactive use of the CLI client, I don’t see much of an obstacle here, but our change may break scripts that rely on being able to change the host based on the configuration file. I don’t know how big a risk that really is and whether it requires more action than a notice in the changelog. What do you think?

Migrations and User Error

The migration scenario is the one we have to think about more thoroughly, as it is one that average users may also be confronted with. However, it comes with the advantage that the old server will tend to not be reachable anymore, so we get the chance to catch the misconfigured URL and throw an error message that suggests actions to solve the problem. We could implement a new command to change the URL associated with a given session, but I think the solution could be much simpler than that.

If we update the URL stored in the session information after each successful login or reauth or even after every interaction with the server, changing it would be as easy as running any command with the -H flag once. We could advise the user to repeat their last command with just the -H flag and proper URL appended, and the issue would be resolved. Handily, this would also solve the issue you mentioned where a user manages to accidentally get their URL wrong. In this case, they could simply go back in their terminal history, fix the issue with the URL in-place, and execute the fixed command again.
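The "update the cached URL on every successful interaction" idea could look like this. A minimal sketch under the assumption that the session record gains a url field; all names are invented.

```rust
/// Hypothetical persisted session record with a cached server URL.
struct StoredSession {
    spn: String,
    token: String,
    url: Option<String>,
}

impl StoredSession {
    /// After any successful login, reauth, or other server interaction,
    /// remember the URL that actually worked. Running one command with
    /// -H and the correct URL is then enough to fix a stale entry.
    fn on_success(&mut self, used_url: &str) {
        if self.url.as_deref() != Some(used_url) {
            self.url = Some(used_url.to_string());
        }
    }
}
```

Because the cache is overwritten rather than append-only, the "user got the URL wrong" case self-heals: the next successful command with -H replaces the bad value.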

Deletion of Session-Bound URLs

The only use case this would not cover is being able to erase the URL associated with a given session. On the other hand, I personally don’t really see a use case for that anyway – the only situation I can see is that the user might not want Kanidm to cache their last URL, either for reasons like plausible deniability or because they don’t like the thought of the program making decisions on their behalf. But in this case, I think the better course of action would be to implement an opt-out that prevents this caching behaviour from occurring in the first place, as otherwise they would have to keep coming back to clear out the cached URL.

mart-w avatar Jan 09 '25 18:01 mart-w

This is a really thoughtful and well written reply, thank you!

I think renaming "instances" to "scopes" might be a good option here. It does make a lot of sense. Then we can potentially combine both ideas.

So let's say a scope is a self contained store of tokens/jwks etc. That way swapping "scope" just isolates what tokens are there and present, and what url is the default. I think that's reasonable.

Now let's think about within a scope. I think we should adapt our JWK checking behavior such that on first use of a URL, we associate the URL with a JWK that we retrieve from the server. Effectively we have a map of URL -> JWK for client side validation. We need to think about how we refresh those JWKs too.

Then next we have "tokens". The question here is how we handle tokens with multiple URLs. Let's say I have replica one and two; my token should be valid on both, even though they are separate URLs (but have the same JWK). But we also don't want to send my token to "naughty server" either, because that's a potential token leak. We want to only send a token to a server if we know that server's JWK is the one that issued it.

The other approach is to make tokens unique to a URL, but then I can have duplicate tokens in my store, so it's not clear which token belongs to which URL.

So I think we need to work out that side of how to approach this in a usable way. That way you could have a scope that can store tokens from many "instances".

Firstyear avatar Jan 10 '25 00:01 Firstyear

Okay, another thought. A scope is the store. An "instance" is the name of a server/origin, and an instance may have many URLs. The "instance" defines the JWK that validates its tokens.

So when you have an instance, it may have multiple URLs which we can choose from, or the instance has only one URL, which is always used as the default for that instance.

Then tokens are associated with an instance. That way the token can be used with any URL of that instance, but won't leak to another instance.

I'm not sure how I'd code this, but this kind of approach may satisfy what you want.

Firstyear avatar Jan 10 '25 00:01 Firstyear

An "instance" is the name of a server/origin, and an instance may have many URLs

Perhaps use the existing terminology of 'domain' here? 😄

yaleman avatar Jan 10 '25 01:01 yaleman

Yes, let’s call it domain to not confuse it with the current definition of "instance".

I think it would be better to map JWKs to domains instead of URLs. Not only is this more representative of the real world, I think it would also make it easier to verify new URLs.

So every scope would have a set of domains, every domain would have a JWK and a set of URLs, and every token (which is what I have so far called a session) would have a domain associated with it. Alternatively, it could also have the URL associated with it, from which we could infer the domain – that would then allow multiple tokens for the same domain in the same scope that lead to different hosts out there. It would be closer to the way I imagined it, where Kanidm just picks the last used URL for any given account, at the cost of convenience if you really are switching hosts a lot for the same token.
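The structure sketched in the last few comments – a scope holds domains, a domain holds one JWK and a set of trusted URLs, and a token is bound to a domain so it can never leak elsewhere – might look roughly like this. All names are hypothetical, and the JWK is reduced to an opaque string for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// One Kanidm domain as seen by the client: a single verification
/// key and every URL that has been verified to represent it.
struct DomainRecord {
    jwk: String,           // opaque stand-in for the real JWK
    urls: HashSet<String>, // trusted URLs for this domain
}

/// A token is bound to a domain, not to a single URL.
struct Token {
    spn: String,
    domain: String,
    value: String,
}

/// A scope: a self-contained store of domains and tokens.
#[derive(Default)]
struct Scope {
    domains: HashMap<String, DomainRecord>,
    tokens: Vec<Token>,
}

impl Scope {
    /// Only release a token to a URL that is trusted for its domain,
    /// so it may reach any replica of that domain but never leaks to
    /// a server belonging to another domain.
    fn may_send(&self, token: &Token, url: &str) -> bool {
        self.domains
            .get(&token.domain)
            .map_or(false, |d| d.urls.contains(url))
    }
}
```

The alternative from the paragraph above (storing the URL on the token and inferring the domain from it) would just move the `domain` field off `Token` and replace it with a `url` field plus a reverse lookup.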

Either way, the question is how we register those URLs for a domain, right? For the URL that was used for login, that is trivial – it issued the token, after all. For any additional URL the Kanidm client encounters in the context of this domain, for example through the -H flag or through a menu item, setting, or environment variable, we’d have to perform an initial verification step to make sure the host is allowed to represent our domain. After that’s verified, we can add the URL to the list of trusted URLs for that domain and only then continue to send our token to authenticate and do whatever we were wanting to do.

The easiest option would be to have the user perform another login or reauth command. If I understand it correctly, every token is signed with the domain’s JWK, so if we get the new host to issue a new token for us, we can verify that against the JWK we already know about and thus prove that both the old and new tokens were issued by the same authority. That would probably be enough to consider the new URL trusted. The advantage of this approach is that it should be relatively straightforward to implement and requires no server-side changes. It does disrupt the flow, however, as we have to prompt for a reauth every time we encounter a new URL for a given domain, even if the session is technically still valid. On the other hand, this would really only happen once per URL, and reauth prompts are already a common occurrence in Kanidm, particularly when doing write operations. I think unless our users constantly add new URLs to their domains, which I find unlikely, it’s not really going to impact the user experience negatively.
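The reauth-based check just described boils down to: ask the candidate URL for a freshly issued token, verify it against the JWK we already hold for the domain, and only then trust the URL. A hypothetical sketch, with a key-id comparison standing in for real JWS signature validation and all names invented:

```rust
/// Stand-in for a freshly issued signed token. In reality this would
/// be a JWS, validated with a proper crypto library.
struct SignedToken {
    key_id: String, // identifies the JWK that signed it
}

/// The JWK we already trust for the domain, learned at first login.
struct KnownJwk {
    key_id: String,
}

/// A token issued by the candidate URL proves that URL speaks for the
/// domain only if it verifies against the JWK we already trust.
/// (Real code would check the cryptographic signature, not just ids.)
fn new_url_is_trusted(known: &KnownJwk, fresh: &SignedToken) -> bool {
    known.key_id == fresh.key_id
}
```

On success the URL would be appended to the domain's trusted-URL set; on failure the client can refuse to send any existing token there, which is exactly the "naughty server" protection discussed earlier.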

If we did decide that this were too much of a hassle, though, we could probably do some sort of challenge-response verification, where we send the host behind the new URL a nonce that it has to sign, and we can then verify that signature. But this would add complexity to the server and probably require a whole lot of security considerations for only a small benefit. On the other hand, it would allow us to seamlessly verify and add any new URL that the user happens to throw at us.

mart-w avatar Jan 10 '25 03:01 mart-w

Obviously, this still doesn’t solve the issue of JWK revocation and replacement. But I don’t think it would make it much more challenging, either. The only limitation is that, in case we implement a JWK rollout mechanism that works upon login, reauth, or any connection to the server in general, we probably couldn’t connect to the domain through a new URL and migrate JWKs at the same time. We would be barred from adding new URLs until our local JWK is updated to the new one, and then things should be back to normal.

I feel like this scenario is edge-casey enough to gloss over for now. If it really were to happen that a JWK migration was in progress and that was just the moment where someone decided to add a new URL, such as an administrator wanting to service a node directly during the migration, I think it’s reasonable to either have them perform the migration first or clear their JWK store and start afresh. This would likely not affect general users.

mart-w avatar Jan 10 '25 04:01 mart-w

So every scope would have a set of domains, every domain would have a JWK and a set of URLs

In your case, rename "scope" as "domain"... as in the existing Kanidm sense, which is a collection of systems running the Kanidm server. The domain is the part of the SPN after the @.

yaleman avatar Jan 10 '25 07:01 yaleman

No, this is not what I mean. In my mind, you can definitely have sessions from different domains accessible from within the same scope. For example, you might be logged in to both [email protected] and [email protected] at the same time, with both of them available through the -D flag or interactive session selector, and Kanidm automatically picking the correct host URL based on the domain. My definition of "scope" is what’s currently called an instance, without the implied assumption that one instance only covers one Kanidm deployment.

mart-w avatar Jan 10 '25 07:01 mart-w

My definition of "scope" is what’s currently called an instance, without the implied assumption that one instance only covers one Kanidm deployment.

Ok, that's making it less clear TBH.

If you implemented your scopes idea, that can identify and contact any number of domains, why have more than one scope?

yaleman avatar Jan 10 '25 11:01 yaleman

It would have a less dominant role than now, but it would still be useful, for example for developers and administrators. It would allow you to work on a staging environment that obviously has different certs, JWKs, and URLs than prod, but might still share the same Kanidm domain part as prod. For development, it would allow you to have a "dev" scope that doesn’t do certificate validation and always points to localhost. And even for the good old "I want to access one of my replicated Kani servers directly instead of through the load balancer" case, it wouldn’t be strictly necessary, but would still provide a shorthand for switching the URL and possibly certificate settings.

Non-technical users, on the other hand, would likely never come into contact with the scoping system at all, even if they have accounts in multiple domains – and especially for those people, that’s a good thing.

mart-w avatar Jan 11 '25 14:01 mart-w

That makes sense, thanks for clarifying 😄

yaleman avatar Jan 11 '25 23:01 yaleman

@mart-w I think this sounds like a good structure. It might be good to make a picture/diagram of it, but otherwise, happy to review any PRs if you want to have a go at it :)

Firstyear avatar Jan 14 '25 03:01 Firstyear

@mart-w Is this solved through our instanced config settings we have in the client tools now?

Firstyear avatar Sep 12 '25 04:09 Firstyear