mbedtls icon indicating copy to clipboard operation
mbedtls copied to clipboard

Make key locking and one-shot operations thread safe

Open Ryan-Everett-arm opened this issue 1 year ago • 5 comments

Description

Make the key "locking" system and one-shot PSA operations thread safe. Resolves #8676.

The work of this PR and the issues #8675, #8677 and #8412 is to ensure that API calls are linearizable and correct. To do this we rely on some properties, these properties need to be enforced. The relevant properties are:

  • All calls to psa_register_read, psa_unregister_read, psa_key_slot_state_transition, psa_wipe_key_slot and psa_key_slot_has_readers are performed under the global mutex.
  • Every API call has at most one point in time where effects are made visible to other threads. By effects here we mean something like a key being created and now existing in the eyes of other threads, or a key being destroyed. We do not need to consider resource constraints here.
  • Any function which iterates through key slots or finds a key slot containing X ignores all slots in the FILLING or PENDING_DELETION state. Whoever sets a slot to FILLING directly or via psa_reserve_free_key_slot has exclusive access to the slot. A slot in PENDING_DELETION can only be read by the registered readers, and must be wiped upon the final reader unregistering.
  • If a slot in the FULL state has more than 1 registered reader then it cannot be wiped (with the exception of a call to mbedtls_psa_cypto_free, a non-threadsafe function).
  • No key can be in multiple FULL slots at the same time, each FULL slot must hold a unique key ID.
  • Other threads running other API calls mustn't interfere in a destructive way.

(See the thread safety document for more details.) This PR must make one shot operations abide by these properties.

Note for reviewers

The first 5 commits of this PR modify the psa_get_and_lock_X functions to make them thread safe. Operations pass a key ID into these functions, and are given a pointer to the slot which holds this key. This slot is in a FULL state, and they are registered as a reader. Once these functions are thread safe, the changes in the 6th commit make all of the one-shot operations thread safe. The intention is for this PR to be reviewed one commit at a time. The commit descriptions contain specific properties of the code which I believe are enforced; these properties are what my locking strategy is based on.

When reviewing the final commit, please check that each operation abides by the LOCK(slot); read from slot; UNREGISTER(slot) pattern. psa_copy_key is the only exception as it also writes to another slot, it uses the functions made threadsafe by #8675.

Ensure that single-shot operations cannot write to any slot (except for psa_copy_key and writing in the locking/unregistering call), or read from any slot which they are not a registered reader of.

PR checklist

Please tick as appropriate and edit the reasons (e.g.: "backport: not needed because this is a new feature")

  • [x] changelog not required for individual tasks on this epic
  • [x] backport not required
  • [x] tests To be added in #8420

Ryan-Everett-arm avatar Jan 31 '24 14:01 Ryan-Everett-arm

I used Github to try to bring this fork up to date with upstream/development (no conflicts) but it created a merge PR. Sorry about this. @paul-elliott-arm should I force push this away or do you have pending review comments which this would wipe?

Ryan-Everett-arm avatar Feb 12 '24 10:02 Ryan-Everett-arm

I used Github to try to bring this fork up to date with upstream/development (no conflicts) but it created a merge PR. Sorry about this. @paul-elliott-arm should I force push this away or do you have pending review comments which this would wipe?

Nono, that's fine, and to be expected. You will know if something has gone horribly wrong, as the diff will likely include other files, and generally be a lot bigger than you remembered!

paul-elliott-arm avatar Feb 12 '24 12:02 paul-elliott-arm

Looks pretty good to me, however I am curious as to why the psa_unregister_read() in psa_destroy_key() remains unguarded.

psa_destroy_key was covered by #8764, do you mean a different function? All unregister_read() calls must be performed under the mutex.

Ryan-Everett-arm avatar Feb 13 '24 16:02 Ryan-Everett-arm

Looks pretty good to me, however I am curious as to why the psa_unregister_read() in psa_destroy_key() remains unguarded.

psa_destroy_key was covered by #8764, do you mean a different function? All unregister_read() calls must be performed under the mutex.

Looking at psa_destroy_key() at the head of this PR, there are no mutex locks in it, and the psa_unregister_read() is the non-mutex variant, however I now see that this was not including changes from the previous PR. This is kinda making integration checks kinda difficult however.

paul-elliott-arm avatar Feb 14 '24 12:02 paul-elliott-arm

Looks pretty good to me, however I am curious as to why the psa_unregister_read() in psa_destroy_key() remains unguarded.

psa_destroy_key was covered by #8764, do you mean a different function? All unregister_read() calls must be performed under the mutex.

Looking at psa_destroy_key() at the head of this PR, there are no mutex locks in it, and the psa_unregister_read() is the non-mutex variant, however I now see that this was not including changes from the previous PR. This is kinda making integration checks kinda difficult however.

This is one downside to spliting the work, but for the sake of keeping PRs small and reviewable I think it is worth it. Just to clarify in case it is unclear - the psa_unregister_read_under_mutex function is only called in situations where one doesn't already hold the mutex, and only requires the mutex for the unregister_read call. It is defined in this PR and not the others since all key loading/destruction functions which call psa_unregister_read need to hold the mutex for a longer portion of the code, so they can't use this new variant.

Ryan-Everett-arm avatar Feb 14 '24 12:02 Ryan-Everett-arm

In the documentation above psa_get_and_lock_key_slot_with_policy() it says

 * It is the responsibility of the caller to call psa_unregister_read(slot)
 * when they have finished reading the contents of the slot.

Should that now be

 * It is the responsibility of the caller to call psa_unregister_read(slot)
 * or psa_unregister_read_under_mutex(slot) when they have finished reading
 * the contents of the slot.

Or should it be

 * It is the responsibility of the caller to call
 * psa_unregister_read_under_mutex(slot) when they have finished reading the
 * contents of the slot.

tom-cosgrove-arm avatar Feb 20 '24 14:02 tom-cosgrove-arm

In the description it says

All calls to psa_register_read, psa_unregister_read, psa_key_slot_state_transition, psa_wipe_key_slot and psa_key_slot_has_readers are performed under the global mutex

but there are a number of calls to bare psa_unregister_read() (not psa_unregister_read_under_mutex()) left. Will those be done in a subsequent PR?

tom-cosgrove-arm avatar Feb 20 '24 16:02 tom-cosgrove-arm

In the description it says

All calls to psa_register_read, psa_unregister_read, psa_key_slot_state_transition, psa_wipe_key_slot and psa_key_slot_has_readers are performed under the global mutex

but there are a number of calls to bare psa_unregister_read() (not psa_unregister_read_under_mutex()) left. Will those be done in a subsequent PR?

See my comment on this. all of these are sorted out by #8764, in that they are already wrapped by the mutex.

paul-elliott-arm avatar Feb 20 '24 16:02 paul-elliott-arm

To answer myself: they are covered by calls to psa_get_and_lock_key_slot() and psa_get_and_lock_key_slot_with_policy(), which call mbedtls_mutex_lock() (which I did check earlier, but missed!)

tom-cosgrove-arm avatar Feb 20 '24 16:02 tom-cosgrove-arm

In the description it says

All calls to psa_register_read, psa_unregister_read, psa_key_slot_state_transition, psa_wipe_key_slot and psa_key_slot_has_readers are performed under the global mutex

but there are a number of calls to bare psa_unregister_read() (not psa_unregister_read_under_mutex()) left. Will those be done in a subsequent PR?

The list of properties in the description are what we need to enforce once this, #8744, #8764 and #8833 are merged. Between these 4 PRs all calls to psa_unregister_read are accounted for and protected.

Ryan-Everett-arm avatar Feb 20 '24 16:02 Ryan-Everett-arm