gramine
gramine copied to clipboard
RFC: trusted files redesign
NOTE: I will not be implementing this; this is me leaving notes/plans for possible next implementer.
Summary
Make trusted files configurable per mount, and move handling to LibOS, similar to what I did for protected files (#371).
In addition, trusted files will be renamed to hashed files.
This has the following benefits:
- Architecture: PAL will be simpler (it will only handle regular files), and the feature will be usable (and testable) in all PALs.
-
UX: More consistent syntax for the user. The user will select one of three mounting modes (
passthrough
,hashed
,encrypted
).
Proposed changes
Old syntax:
fs.mounts = [
{ path = "/usr/lib", uri = "file:/usr/lib" },
]
sgx.allowed_files = [
"file:/usr/lib/allowed"
]
sgx.trusted_files = [
"file:/usr/lib",
"file:/usr/lib/libpython.so",
{ uri = "file:/usr/lib/libpython.so", sha256 = "..." },
]
New syntax:
fs.mounts = [
{ type = "hashed", path = "/usr/lib", uri = "file:/usr/lib" },
{ type = "passthrough", path = "/usr/lib/allowed", uri = "file:/usr/lib/allowed" },
]
fs.hashed_files = [
"file:/usr/lib",
"file:/usr/lib/libpython.so",
{ uri = "file:/usr/lib/libpython.so", sha256 = "..." },
]
-
sgx.trusted_files
is simply renamed tofs.hashed_files
(during the deprecation period, we respect both settings). -
gramine-manifest
computes the hashes when building*.manifest
(not only*.manifest.sgx
). After a deprecation period,*.manifest.sgx
will not be necessary! -
hashed
becomes a separate mount type (internally calledchroot_hashed
, same aschroot_encrypted
). All files under ahashed
mount have to be on thefs.hashed_files
list. If a host file is not on that list, or has a different hash, it's not visible on guest. (Alternatively, a different hash might produce-EACCES
). -
To allow accessing a file directly, mount it with
type = "passthrough"
. This replacessgx.allowed_files
. -
PAL still needs to load LibOS, which needs to be hashed. So instead of trusted files, we will supply a hash for it directly in manifest:
loader.entrypoint = "file:.../libsysdb.so"
loader.entrypoint_sha256 = "..."
-
There is no
sgx.file_check_policy = "allow_all_but_log"
. -
The default mount type (
chroot
) is deprecated, and it works in a sort of "compatibility mode":- if the PAL type is
Linux-SGX
, it respectssgx.trusted_files
andsgx.allowed_files
, - otherwise, it functions the same as
passthrough
.
- if the PAL type is
-
The root mount is still
type = "chroot"
by default, but a warning is displayed, encouraging the user to setfs.root.type = "hashed"
.
Implementation
I think the implementation can follow the encrypted files: there should be two layers, filesystem (handling inodes, handles etc.) and library (handling the actual logic of verifying trusted files).
FS layer:
- Create a separate filesystem (
chroot/hashed.c
). - Note that this filesystem is read-only: it handles read and stat, but not creat/chmod/truncate/unlink.
- Handle directories same as "normal" filesystem.
- In inodes, store a
struct shim_hashed_file
object. On lookup, use the library (see below) to create this object and verify file hash. - File handles contain a PAL handle, same as in normal
chroot
filesystem, but reading is done through the library (see below).
Library:
- This can be a module (
shim_fs_hashed.c
) similar toshim_fs_encrypted.c
. - This module doesn't use the rest of LibOS, just interfaces with PAL to load files.
- Suggested API:
/* `shim_fs_hashed.c` */
/* 256-bit hash of the whole file (SHA256) */
typedef ... file_hash_t;
/* 128-bit hashes of individual chunk (SHA256 truncated to 128 bits) */
typedef ... chunk_hash_t;
/* See TRUSTED_CHUNK_SIZE in current PAL */
#define CHUNK_SIZE (1 << 14)
/*
* Stores information about a single hashed file. Once created, this structure
* is immutable, so there's no need for locking when accessing it.
*
* See also `enclave_tf_structs.h` in current PAL.
*/
struct shim_hashed_file {
const char* uri;
uint64_t size;
/* Hash of the whole file */
file_hash_t file_hash;
/* Hashes of individual chunks */
chunk_hash_t* chunk_hashes;
}
/*
* Read `fs.hashed_files` into a lookup table (uthash?)
*/
int init_hashed_files(void);
/*
* Initializes a hashed file:
*
* - verifies the hash of the entire file,
* - computes chunk hashes,
* - creates and populates the `shim_hashed_file` object.
*/
int new_hashed_file(const char* uri, PAL_HANDLE handle, struct shim_hashed_file** out_hf);
void delete_hashed_file(struct shim_hashed_file* hf);
/*
* Reads data from file, verifying chunk hashes. `count` and `offset` should be aligned to `CHUNK_SIZE`.
*/
int read_hashed_file(struct shim_hashed_file* hf, PAL_HANDLE handle, buf, size_t count, file_off_t offset);
Note that this is somewhat inefficient:
- on file lookup (e.g.
stat
), it has to read the whole file in order to know its size, - on file read, it reads a whole chunk of data, even if the amount requested by user is smaller.
However, AFAIK this matches the current implementation in PAL. We can add caching if it ever becomes an issue.
Implementation plan
-
Implement the
hashed
filesystem as a separate feature (that looks atfs.hashed_files
). Add support togramine-manifest
tool: note that it will compute the file hashes when building*.manifest
, NOT*.manifest.sgx
. -
Implement
loader.entrypoint_sha256
for loading LibOS by PAL. -
Remove trusted files from PAL, route
sgx.trusted_files
etc. to the new code. Add deprecation notices.
Issues
-
The new design assumes
sgx.file_check_policy = "allow_all_but_log"
is not necessary. Is it? -
The target implementation is, IMO, simple, but the compatibility code (during deprecation period) will be more complicated, in order to match the current behavior.
-
The chunk hashes are SHA256 truncated to 128 bits, presumably to save space. Maybe we should just use 256-bit chunk hashes? (If the host can find a collision for the chunk hash function, it can replace the hashes after Gramine verifies the file)
The new design assumes
sgx.file_check_policy = "allow_all_but_log"
is not necessary. Is it?
This is necessary. People use it quite often.
I see. Do you think people use it for writable files as well?
Anyway, I'd suggest to add it as a mount parameter:
fs.mounts = [
{ type = "hashed", path = "/usr/lib", uri = "file:/usr/lib", policy = "allow_all_but_log" },
]
And, if we can afford it, this will treat unknown files as openable but still read-only.
I see. Do you think people use it for writable files as well?
I do use them a lot, especially when people want to quickly evaluate an existing app with Gramine and they can't give us the whole environment. I usually suggest them to use a very permissive manifest, let them run the app successfully in their environment. Then we'll go through the runtime log and come up with a more restrictive manifest for production.
Without the ability to read/write any file, the app will usually stop at the first violation, and after going back and forth several times, new beginners will fell Gramine is difficult to use, while it really isn't.
And, if we can afford it, this will treat unknown files as openable but still read-only.
Yes, I think this is reasonable. People can always just use type = "chroot"
which will become a pass-through with this proposal anyway. And when they are ready, they will change the type to the correct type = "hashed"
.
Yes, I think this is reasonable. People can always just use
type = "chroot"
which will become a pass-through with this proposal anyway. And when they are ready, they will change the type to the correcttype = "hashed"
.
With normal passthrough, we won't be able to see which file is accessed from the runtime log. Perhaps we can also add policy = "allow_all_but_log" to passthrough mounts as well, to log first (read or write) access to each file?
With normal passthrough, we won't be able to see which file is accessed from the runtime log. Perhaps we can also add policy = "allow_all_but_log" to passthrough mounts as well, to log first (read or write) access to each file?
That's a very good point. Ok, yes, maybe we should add this allow_all_bug_log
to passthrough mounts as well.
Just a note before I forget it again: after this redesign is done, and the documentation https://gramine.readthedocs.io/en/latest/manifest-syntax.html is updated, we must move the syntax descriptions from under SGX syntax to common syntax.
For example, currently Encrypted Files (https://gramine.readthedocs.io/en/latest/manifest-syntax.html#encrypted-files) are kept under "SGX syntax" section. This is not correct anymore -- they are applicable to any PAL (backend), so they should be under "Common syntax" section.
Same will be with Trusted Files after the redesign.