cms icon indicating copy to clipboard operation
cms copied to clipboard

Use Isolate's /box directory and filesystem quotas

Open prandla opened this issue 4 months ago • 4 comments

  • Use /box instead of /tmp as the sandbox's home dir.
  • Remove allow_writing_only; it's impossible to enforce with the new design.
  • Remove "secure commands" mechanism, run all commands in sandbox. (I think it was mostly useful earlier to run unzip, by default with allow_writing_only it couldn't unzip properly. But none of the languages even run unzip anymore.)
  • Simplify cleanup logic a bit, instead of keeping the sandbox dir, we archive the sandbox in case of failure. Also remove the keep_sandbox flag. (TODO: write the tool that unzips an archived sandbox for debugging. It can take the archive hash as an argument and load it from the file storage automatically.)
  • Remove Job.keep_sandbox field, from what I can tell this was already unused.
  • Remove sandbox.max_file_size option: It breaks dotnet, and it doesn't enforce any cap on the total size of /box.
  • And finally, make CMS set isolate's --quota.

Creating automated tests to verify the quota behavior was nontrivial: setting up a quota-enabled filesystem is actually somewhat tricky (or at least poorly documented). Initially I tested locally with a tmpfs as the box_root; this required hacking isolate to use quotactl_fd instead of quotactl. (the former is a much nicer interface, but it's undocumented. I need to make a bug report to man-pages to get it documented :) )

~~Eventually I figured out that with an ext4 filesystem on a loop device, it's also possible to use the old quotactl, and thus the current release of isolate, when running tests.~~ We apparently need tmpfs quotas on github actions. New Isolate release uses quotactl_fd exclusively.

Also, since the quota setting is completely global, I had to set it to a somewhat high value (or well, I don't actually know how big is required, but since it's also used during compilation i set it to 64MB). And then I had to make the loop filesystem quite large also...

I also wrote some documentation about quotas in the sample config file (should this be moved to the documentation instead?). The conditions for how to configure the filesystem for quota support were found by pure trial and error :), but during it I discovered some combinations where isolate runs successfully with the --quota setting, but the quota is silently not enforced. This seems a bit scary, perhaps we should explicitly warn against this...

Closes #1242, closes #1005, closes #916, closes #309, maybe a few more :)

prandla avatar Aug 26 '25 07:08 prandla

Oh, i just noticed the error in the CI logs (which was actually there earlier too, i just missed it...)

"mount: /var/lib/isolate: mount(2) system call failed: No such process."

I have no idea what this means. It works on my machine, and I don't have any ubuntu machines to test on...

prandla avatar Aug 26 '25 08:08 prandla

well great. github actions is running kernel 6.11.0-1018-azure, which simply does not have quota support compiled in (neither built-in nor as a kernel module). (this wasn't easy to even verify because they also disabled /proc/config.gz.)

So, time to go spelunking for a kernel that supports it i guess???

prandla avatar Aug 26 '25 17:08 prandla

Okay, further experimentation shows that while there is no way to get a more suitable kernel, there is a way to use quotas: the lack of quota support i mentioned earlier turned out to be a lack of support for the ext4 quota file format. tmpfs quotas are supported though.

This still means we need a new isolate release though, because the only way to access tmpfs quotas is via quotactl_fd. I hacked it together here: https://github.com/prandla/isolate/commit/8c4f994061d5190862a8e6131893dab290f7bac2 but you can do a nicer implementation that checks for #ifdef SYS_quotactl_fd if you want.

prandla avatar Aug 26 '25 17:08 prandla

Rewrote the description of quotas in cms.sample.toml a bit. (Is that too big of a wall of text? Should it maybe be moved to the documentation instead?) Also removed max_file_size (see commit message).

Getting tests to pass on Github Actions is still blocked on the Isolate PR linked above. All tests pass on my machine.

prandla avatar Aug 31 '25 08:08 prandla