Serge Bazanski
We currently hardcode the assumed system partition size in our installer code. The value picked there is future-proof for Metropolis, but not flexible enough to support other payloads....
Some production machines we're dealing with seem to be very confused by the existence of two EFI System Partitions - the one stock on the machine and the one we...
We need some RAID (1?) support for the data partition in Metropolis nodes. @lorenz has some opinions on what to do here.
We've never really tested Control Plane HA in an end-to-end scenario, and we reaped the effects of that during our first large production deployment. Related fixes: 1. https://review.monogon.dev/c/monogon/+/2067 2. https://review.monogon.dev/c/monogon/+/2068...
We want a CLI to talk to a BMDB, initially to at least query per-machine state and export that state to a reasonably machine-parsable format. This is expected to be...
`man 7 kernel_lockdown` This is tangentially related to enabling Secure Boot, but we should do it as early as possible - even if we don't sign things and have module...
What we want is the usual suspects: 1. Latency 2. Error rate 3. Request rate This will let us detect more serious conditions like #276 without catching _that_ particular...
We need to expose the current Lumen web panel consumer as a gRPC API for use by the BMaaS control plane. This involves: 1. Designing the API, probably making it...
Some Equinix machines we provision (currently around 1-10%) get stuck in Failed per the Equinix API after we attempt to provision them. The shepherd should probably notice this and nuke...
The system should be aware of whether a machine is considered 'visible' to the user, and when such a tag is present, take precautions to not run disruptive / lossy processes....