Revoking a yubikey within a cluster
I have a cluster of validation servers and I would like to enable/disable individual yubikeys cluster-wide. I found the /revoke endpoint, but that appears to only change the active flag in the local database (it doesn't use the synchronization queue), so I would have to hit /revoke on every server in the pool. If a server happened to be offline when that request was made, there doesn't appear to be a way to maintain integrity across the cluster. The normal /sync calls do not include this field.
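For illustration, here is roughly what cluster-wide revocation requires today: looping over every pool member yourself. This is only a sketch; the host list is made up, and the query parameter names ("id", "active") are assumptions, not the actual yubikey-val revoke API.

import requests

# Hypothetical pool members; replace with your validation servers.
POOL = ["https://val1.example.com", "https://val2.example.com",
        "https://val3.example.com"]

def revoke_everywhere(public_id):
    """Call /revoke on each server in turn. There is no queue behind
    this, so any server that is offline right now simply diverges."""
    failed = []
    for host in POOL:
        try:
            # Parameter names are assumptions -- check your endpoint.
            resp = requests.get(host + "/revoke",
                                params={"id": public_id, "active": "false"},
                                timeout=5)
            resp.raise_for_status()
        except requests.RequestException:
            failed.append(host)  # out of sync until retried by hand
    return failed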
I also discovered that the /resync command skips synchronizing keys that have the active flag set to false. So if I follow step 10 described here to add a new server to the pool, there will be a data mismatch if I try to verify an inactive yubikey on the new server.
Do you have any recommendations for how I should handle this? It looks like the active/inactive flag is a partially implemented feature that hasn't been thought out for clustered setups. Or maybe I'm just misunderstanding its purpose?
I'd say that your last statement is quite true: active isn't implemented enough to be usable with this project alone. Yubico used to run a service for revoking keys in YubiCloud, which had its own queue of revocations in case a server was down.
As the sync protocol looks right now, the active flag isn't transmittable over it at all. I guess the easiest way to give you something that handles new servers would be to let resync hit the revoke endpoint for inactive identities.
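A rough, purely illustrative sketch of that suggestion (the yubikeys table and column names follow yubikey-val's schema as I understand it; the /revoke call reuses the assumed API from the earlier sketch, and sync_counters() is a hypothetical stand-in for the existing resync path):

import requests

def resync_identity(cur, remote_host, public_id):
    """Proposed behaviour: instead of skipping inactive identities,
    resync forwards them to the remote server's /revoke endpoint."""
    cur.execute("SELECT active FROM yubikeys WHERE yk_publicname = %s",
                (public_id,))
    row = cur.fetchone()
    if row and not row[0]:
        # Assumed /revoke API, as in the earlier sketch.
        requests.get(remote_host + "/revoke",
                     params={"id": public_id, "active": "false"},
                     timeout=5)
    else:
        sync_counters(remote_host, public_id)  # hypothetical helper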
Here's what I ended up doing in case anybody stumbles across this issue:
I couldn't come up with a robust and secure way to use /revoke. I didn't like that I had to rely on an IP whitelist for access instead of an API key with more fine-grained permission controls, and I ran into some edge cases trying to implement synchronization due to the simplistic error handling in the /revoke endpoint.
I decided to go in another direction and approached this at the KSM level. I'm already storing my AEAD keys in a MySQL database, which is replicated across my cluster. So I added an active flag to the AEAD table:
CREATE TABLE ykksm.aead_table (
  public_id varchar(16) NOT NULL,  -- modhex public ID of the yubikey
  keyhandle INT NOT NULL,          -- key handle the AEAD is wrapped under
  nonce BLOB(6) NOT NULL,
  aead BLOB(32) NOT NULL,          -- encrypted key material
  active BOOL NOT NULL,            -- new: set to false to revoke the key
  PRIMARY KEY (public_id, keyhandle)
);
I added some logic to my KSM servers to check the active flag when they try to load a key handle. If it's false, the KSM returns an ERR Disabled public_id failure, which gets bubbled up in the yubikey-val server as a generic BAD_OTP response.
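A minimal sketch of that check, written in Python for clarity (the real yubikey-ksm is PHP; decrypt_otp() is a hypothetical stand-in for the KSM's existing decrypt path, and cur is a DB-API cursor on the replicated ykksm database):

def load_aead(cur, public_id, keyhandle, otp):
    """Look up the AEAD row and refuse to serve disabled keys."""
    cur.execute("SELECT nonce, aead, active FROM ykksm.aead_table"
                " WHERE public_id = %s AND keyhandle = %s",
                (public_id, keyhandle))
    row = cur.fetchone()
    if row is None:
        return "ERR Unknown yubikey"
    nonce, aead, active = row
    if not active:
        # yubikey-val reports any KSM error to the client as BAD_OTP.
        return "ERR Disabled " + public_id
    return decrypt_otp(nonce, aead, otp)  # normal decrypt path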
This allows me to sidestep both of the issues I outlined in my previous post. I've disabled the /revoke endpoint, and from the validation server's perspective every yubikey is now active. Our administrators can use the same internal tool to provision new yubikeys into our AEAD database, as well as enable/disable yubikeys if they need to be revoked.
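Revocation then boils down to a single UPDATE that MySQL replication carries to every KSM node; something like the following (cursor/connection handling elided):

def set_yubikey_active(cur, public_id, active):
    """Enable or disable a yubikey cluster-wide via the replicated
    AEAD table; no per-server /revoke calls needed."""
    cur.execute("UPDATE ykksm.aead_table SET active = %s"
                " WHERE public_id = %s",
                (active, public_id))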