gramine
Process creation implementation
When creating a new process, a fresh enclave is established, and all memory from the parent process is copied in bulk to the child process. Due to the large volume of data, the creation of child processes is highly time-consuming (taking approximately 30 seconds to several minutes or even longer). We propose an optimization: track modifications to each page within the process. During subsequent serialization and transfer operations, only the modified ones need to be serialized and transferred, while unmodified pages can be omitted from transmission to the child process.
Certainly, "memory modification tracking" can be implemented using write protection, which is very efficient when handled in the write protection fault handler.
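To make the proposal concrete, here is a minimal sketch of write-protection-based dirty page tracking as a pure Python model. It is not Gramine code: `TrackedMemory`, `_on_write_fault`, `collect_dirty` and the 4096-byte `PAGE_SIZE` are hypothetical names and values chosen for illustration. It only models the bookkeeping (the first write to a protected page "faults" and marks the page dirty) and says nothing about what such a fault actually costs inside an enclave.

```python
PAGE_SIZE = 4096  # assumed page size for this model


class TrackedMemory:
    """Toy model of dirty page tracking via write protection."""

    def __init__(self, num_pages):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.write_protected = set(range(num_pages))  # all pages start protected
        self.dirty = set()
        self.faults = 0

    def _on_write_fault(self, page_no):
        # Stands in for the write-protection fault handler: record the page
        # as modified, then lift the protection so that later writes to the
        # same page do not fault again.
        self.faults += 1
        self.dirty.add(page_no)
        self.write_protected.discard(page_no)

    def write(self, addr, data):
        page_no, offset = divmod(addr, PAGE_SIZE)
        if offset + len(data) > PAGE_SIZE:
            raise ValueError("this model does not support cross-page writes")
        if page_no in self.write_protected:
            self._on_write_fault(page_no)
        self.pages[page_no][offset:offset + len(data)] = data

    def collect_dirty(self):
        """Return {page_no: contents} for modified pages and re-arm tracking."""
        delta = {n: bytes(self.pages[n]) for n in sorted(self.dirty)}
        self.write_protected.update(self.dirty)
        self.dirty.clear()
        return delta


mem = TrackedMemory(num_pages=8)
mem.write(0, b"abc")
mem.write(10, b"def")                 # same page, so no second fault
mem.write(3 * PAGE_SIZE + 5, b"xyz")
delta = mem.collect_dirty()
print(sorted(delta))  # [0, 3]
print(mem.faults)     # 2
```

Only pages 0 and 3 would be serialized here. The sketch assumes the child has some other way to obtain the six untouched pages, which the proposal does not spell out.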
Yes, with Gramine we do a checkpoint and full restore for fork. Not only do we need to create a brand-new process, we also need to establish a secure channel and transfer the checkpointed memory over that encrypted channel. The optimization you are proposing is similar to dirty page tracking for VM migration, and it would be good to explore as an alternative to a full restore. Is this something you are interested in implementing?
There is another issue in this space that proposes to optimize the child process creation step: https://github.com/gramineproject/gramine/issues/430
> write protection, which is very efficient when handled in the write protection fault handler.
It definitely isn't. Remember that we aren't in ring 0 and, moreover, we need to send the memory to a remote process.
> We propose an optimization: track modifications to each page within the process. During subsequent serialization and transfer operations, only the modified ones need to be serialized and transferred, while unmodified pages can be omitted from transmission to the child process.
Consider this scenario:
1. Process A forks into A and B.
2. Process A allocates a new memory block X.
3. Process A forks into A and C.
4. Process A changes memory in block X.
5. Process A changes memory in some block other than X.
What exactly do you propose should happen in steps 4 and 5?
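One way to make this question concrete is to trace the scenario through a toy tracker. The sketch below is only an illustration: `ForkTracker` and the block names `"code"` and `"heap"` are hypothetical, and resetting the dirty set at every fork is an assumed policy, not something the proposal specifies. It does not answer the question. It only shows the state a real design would have to deal with.

```python
class ForkTracker:
    """Toy tracker: live blocks plus the blocks changed since the last fork."""

    def __init__(self, blocks):
        self.live = set(blocks)
        self.dirty = set()
        self.log = []

    def allocate(self, block):
        self.live.add(block)
        self.dirty.add(block)

    def modify(self, block):
        self.dirty.add(block)

    def fork(self, child):
        # Record what a "dirty only" transfer would send next to what a
        # full checkpoint would send, then reset tracking (assumed policy).
        self.log.append((child, sorted(self.dirty), sorted(self.live)))
        self.dirty.clear()


a = ForkTracker(blocks=["code", "heap"])
a.fork("B")       # step 1
a.allocate("X")   # step 2
a.fork("C")       # step 3
a.modify("X")     # step 4
a.modify("heap")  # step 5

for child, delta, full in a.log:
    print(child, "delta:", delta, "full:", full)
print("dirty after step 5:", sorted(a.dirty))
```

In this model the delta for B is empty and the delta for C contains only X, yet each child starts as a fresh enclave and still needs every live block from somewhere. Steps 4 and 5 then mark X and one other block dirty while no fork is pending, so the tracker accumulates state whose consumer is undefined.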