Instrumented path based operations using hooks defined in `Checker`
Instrumented path based operations using hooks defined in Checker.
trait Checker {
def onRead(path: ReadablePath): Unit
def onWrite(path: Path): Unit
}
Exceptions
The following operations were not instrumented:
followLink,readLinklist,walkexists,isLink,isFile,isDir- read operations for permissions/stats
watch
Future work
- A more comprehensive design would add hooks for each core operation. This would eliminate the special check handling in operations like
moveandsymlink. - As such, the methods of
ReadablePathrepresent escape hatches. These cannot be "plugged" without breaking binary compatibility.
This resolves part 1 of mill #3746.
@lihaoyi Please review.
Thanks @ajaychandran, I think this looks good. Could you add Scaladoc to the checker API as well? Then I can merge this and cut an unstable -M1 so you can start using it in Mill
The following operations were not instrumented:
Let's instrument these as well. For the purposes of Mill's (and probably others), we also want to block code from making decisions based on filesystem metadata outside their designated sandboxes, not just file contents
A more comprehensive design would add hooks for each core operation. This would eliminate the special check handling in operations like move and symlink.
@lihaoyi Any thoughts on this?
I'd rather not add special handling in the operation implementations.
The number of core operations is not that large. We could even pass in additional parameters like createFolders, followLinks to have the full context available.
Could you add Scaladoc to the checker API as well?
I had removed Scaladoc from the PR description. Do we need more than this?
/**
* Defines hooks for path based operations.
*
* This, in conjunction with [[checker]], can be used to implement custom checks like
* - restricting operations to some path(s)
* - logging operations
*/
trait Checker {
/** A hook for a read operation on `path`. */
def onRead(path: ReadablePath): Unit
/** A hook for a write operation on `path`. */
def onWrite(path: Path): Unit
}```
@ajaychandran I'm not sure what you mean by special check handling. Could you elaborate?
- For
move, I would expect it to callonWriteto both source and destination paths, since it is modifying both. - For
symlink, I would only expect it to callonWritefor the path of the created link, and do nothing for the path the link is pointing to (since we are neither reading anything from it nor modifying it)
I think for the purposes of this ticket, let's stick with a simple onRead and onWrite API. That's what we'll need in Mill. We can flesh it out more in future releases as necessary
@lihaoyi The current implementation assumes an equivalence between write and create/remove operations. I am not sure if this would fit all use cases.
- The modified implementation for
movechecks for write access on the source parent directory. I gather this is not per your expectation. - I assumed a
symlinked path could be used as argument to read. If so, a user could gain access to any file. - Should
makeDirtest write access on an existing parent folder whencreateFoldersis true?
Adding a hook for each core operation lets os-lib remain unopinionated.
Maybe rename Checker to Hook?
Good point on the symlinks and enclosing folders. In the end this is not going to be an airtight sandbox, so it's a judgement call exactly where to draw the line.
-
For
move, let's callonWriteit on the full source and destination paths -
I think for creating enclosing folders, let's just punt on that and say it's up the the implementer of
def onWrite -
For symlinks, I think what you say is reasonable, so lets call
onWriteon the destination path, since we can compute it in-memory by resolving any relative path without needing to perform any filesystem operations
I'm a bit reluctant to try and match the entire OS-Lib API in the Checker interface. Let's go with the more minimal ad-hoc interface for now.
For the purposes of this pull request, could you add an @experimental annotation, apply it to the stuff you added, and add it to the def mimaExcludeAnnotations in the build file? Just to make it clear that this API isn't stable, and we're only publishing it because we need to experiment with it in Mill
@ajaychandran not everything needs to be experimental, only the public APIs you added: os.checker and os.Checker I think
@lihaoyi This is ready for another review.
Updated hardlink as well.
@ajaychandran code changes look good. Last request here is to move all the checker tests into one separate test suite, rather than having them mixed in with all the non-checker tests. That will make it easier to skim them and audit them as necessary.
After that i will merge this and cut a milestone release for experimentation in Mill
@lihaoyi Checker tests have been moved.
Some compound operations like zip incur the cost of double-checking.
One solution would be to define a private[os] def unchecked variant, for each operation, for internal use. This can be deferred until the feature is stable.
@lihaoyi Checker tests have been moved.
Some compound operations like
zipincur the cost of double-checking. One solution would be to define aprivate[os] def uncheckedvariant, for each operation, for internal use. This can be deferred until the feature is stable.
Another option is to delegate some sort of caching solution to the implementor of Checker.
@ajaychandran pushed a tag for 0.11.4-M1 https://github.com/com-lihaoyi/os-lib/actions/runs/11537940922 hopefully it goes out soon and you can try it out in Mill. If you're going to work on the downstream tasks, we can hold off on payment for now to save on transfer fees. If not I can transfer the first portion of the bounty using the details we used before
@lihaoyi Thanks. Lets hold off payment.