
Add error handling for limits of byte-oriented command line args

Open brson opened this issue 1 year ago • 2 comments

Some command line arguments have error handling for cases where the arguments are outside of reasonable limits. Several do not. Here are some that I see offhand:

  • --cache-grid - Extremely large values hit an intCast assertion; unreasonably large values (e.g. 1000000GB) are accepted.
  • --cache-accounts - Small values trigger "unreachable" assertions; large values hit the intCast assertion.
  • --cache-transfers - Same as --cache-accounts.

brson avatar Jan 25 '24 00:01 brson

Note that some of the existing error handling occurs in main.zig, some in cli.zig. It seems reasonable to decide on one place.

brson avatar Jan 25 '24 00:01 brson

> Note that some of the existing error handling occurs in main.zig, some in cli.zig. It seems reasonable to decide on one place.

Error handling (printing a nice message and exiting the process) should occur in cli.zig.

However, it’s generally a good idea to also assert in main (or even at the deepest layer that the value of a particular argument penetrates to) that the value is reasonable, as defense in depth.

matklad avatar Jan 25 '24 00:01 matklad

https://github.com/tigerbeetle/tigerbeetle/pull/2122 will fix the overflows.

I am unsure what's the right way to handle things like the --cache-grid=2TiB case: there's no static upper bound we can use here, since whether the value is reasonable depends entirely on the box you are running this on.

At the same time, I feel like we should handle this: it's a plausible failure mode to accidentally type an extra zero when sizing the cache, or one zero too few when speccing a VM to run TigerBeetle.

Right now, the three possible failure modes are:

  • allocation failure during startup (good!)
  • some process (not necessarily TigerBeetle) gets killed by the OOM killer (bad!)
  • the computer starts swapping (bad!)

I think at startup we should ask how much total RAM we have, and sanity-check that the sum of the sizes of all caches doesn't exceed that amount. It seems like the way to do that would be to parse /proc/meminfo?

matklad avatar Jul 22 '24 12:07 matklad

> I think at startup we should ask how much total RAM we have, and sanity-check that the sum of the sizes of all caches doesn't exceed that amount. It seems like the way to do that would be to parse /proc/meminfo?

That could work; it would just have to be done in a portable way (i.e. getting similar info on Windows and macOS).

I'm not sure if that's already a common pattern in the codebase (i.e. having platform-specific checks/logic).

dadepo avatar Jul 22 '24 18:07 dadepo