proot
proot copied to clipboard
General discussion - Improving PRoot's performance : caching lstat calls
Hi everyone,
I'm opening this kind of thread to discuss some points of interest related to performance in PRoot. The features I would like to talk about here might be difficult to integrate in the existing codebase (or maybe not), but will always be useful for newer implementations of PRoot if they are thought about from the beginning (later on this in another thread).
This issue will be about limiting disk accesses.
Caching lstat
calls
Why
When translating a path, PRoot does some checks on it to see it's a valid path, dereference symlinks, and obtain its permissions. For instance, just with this command
proot /home/user/my_program
PRoot will have to canonicalize the path, which implies checking that each intermediary path is valid,
so a call per word:
lstat("/home/")
lstat("/home/user")
lstat("/home/user/my_program")
Now, in the case of large IO manipulations involving copying a large amount of files, many of these lstat
will actually be the same (if files have same folder ancestors).
And the worse thing is, the longer a path is, the slower it gets.
If you have a file tree of n
nested folders, and the last one has m
files,
the number of lstat
calls will be in O(n * m)
.
The idea is then to cache lstat
calls, to avoid as much disk seeks as possible (see here for latency numbers). Applied to our previous calculation, caching the calls to the nested folders, it would reduce the numbers of lstat
calls to O(n + m)
instead.
This idea already appears in the roadmap, but wasn't implemented.
How
That where we're going to have to discuss, as any solution that follows will have vulnerabilities that I would have missed.
-
HashMap<Path, Metadata>
Eachlstat
call is cached, using the (canonicalized) path itself as the key. The issue there is cache invalidation: if a parent folder is deleted, we don't have an easy way to clear the cache of all the subfiles and subfolders, or this would necessitate another data structure to remember the full filesystem tree. -
Trie<Path, Metadata>
This one is like the hash map, but includes a tree structure that would be useful for easily invalidating a folder's cache and its children. If we detect a delete or rename operation on a node of the trie, we clear the cache of all the related branches.The fun part comes from the mess that this trie can become when we include outside-of-the-guest-root bindings, glue, symlinks, mounts... But it shouldn't be impossible,
And of course, this feature will probably be disabled by default.
Issues
What if another non-ptraced process deletes one of the folders? PRoot can't detect any file manipulation outside of the ptraced-process.
Questions
- The roadmap mentions that the syscall is slow
on some filesystems (like NFS)
. Does it mean that there are already some sort of cache forlstat
or is negligeable on most filesystems? Will cachinglstat
really improve performances, or is it actually not that important, and the slowness comes from somewhere else? - Is the supposition that only the prooted-process will have access to its guest filesystem (or at least do write operations on it), enough to justify the utility of such a feature? So, to sum up, just ignore the issue?
- Any idea on how to solve this issue in a general case?
I use the proot -b xx:xx for a gui application, and I also found that the PRoot's performance is a little low. In my case, proot always used high up to 25% of cpu. 1% - 5% cpu usage will be acceptable, but 25% cpu usage is too high.
Here is the result of how the proot use the resource coming from callgrind in valgrind. It shows translate_syscall used up to 47% resource. I just run proot without any option. So it means that, without any option, the proot always try to translate the syscall and canonicalize the path, even when there is no need to canonicalize path but just copy the path. This behavior slow down the performance when people use the proot with only -b option. As the roadmap says, one of the core function of proot is path translation. It means a lot to make the binding function works better with only -b option.