RFC: `multipass config/get/set` structure and user experience
#307 has some details on specifics of the "custom repository" feature, but I wanted to collect the user experience and the imaginable extent of the configuration structure here.
## The CLI
Three new commands would be introduced, all operating on YAML-formatted data, with . (periods) separating depth levels. All operations need to be atomic and return validation errors.
```
multipass config [--no-defaults] [--expand] [<subtree>]
    # opens an editor in interactive mode
    # or prints the YAML of the subtree to standard out in non-interactive mode
    # if `--no-defaults` is given, only explicitly configured values are shown
    # some parts of the tree may not be shown by default (think remote and instance
    #   configuration); passing `--expand` means the whole tree is shown
    # on errors, returns to the editor with annotations about validation or configuration issues

multipass get <key>
    # prints the value of the requested configuration key

multipass set [<key>=<value> …]
    # sets the values of the given configuration keys
    # or accepts a YAML subset of the configuration tree on standard input
```
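To give a feel for how these would behave, here is a hypothetical session (a sketch of the proposed behaviour, not existing commands; the keys come from the structure below):

```console
# read a single value
$ multipass get client.primary-name
primary

# set several keys at once; the whole request fails if any value does not validate
$ multipass set client.launch-defaults.cpus=2 client.launch-defaults.memory=4GB

# feed a YAML subtree on standard input
$ multipass set <<EOF
client:
  launch-defaults:
    disk: 10GB
EOF

# print only explicitly configured values under the `local` remote
$ multipass config --no-defaults local
```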
## The structure
To avoid unnecessary nesting, I propose that the only top-level keys be `client` and the remote names, with each remote's configuration nested under its name.
```yaml
client:
  default-remote: local   # name of one of the remotes configured below
  primary-name: primary   # name of the primary instance
  launch-defaults:
    image: default        # see `multipass find` for available images or use file:// or http:// URLs
    cpus: 1               # recommended at most one below your host's cores
    disk: 5GB             # this is the maximum the instance can use, so go big
    memory: 2GB           # maximum the instance can use, shared between host and instances
    cloud-init: {}        # see http://cloudinit.readthedocs.io/en/latest/topics/examples.html
    mounts:
      /local/path: remote/path
      /other/path:
        target: other/remote/path
        uid_maps:
          "*": default    # "*" for "all", "default" for the default user's UID inside the instance,…
        gid_maps:
          "*": default    # …or the numeric ID
  images:
    example:              # this is the image name to use for `multipass launch`
      aliases: [ex]       # an optional list of alternative names
      image: default      # defaults to this image's key above
      cpus: 2             # overrides the default launch options from above
    lts:                  # because the `lts` alias exists on the remote…
      disk: 25GB          # …this only overrides the disk size when doing `multipass launch lts`

local:                    # configured on first start
  address: unix:/run/multipass_socket   # platform default
  # below is remote configuration
  listen-address: unix:/run/multipass_socket
  driver: qemu            # one of qemu, hyperkit, hyper-v, libvirt - remote platform default
  network: 10.0.6.0/24    # needs extending for bridging and IPv6
  proxy:
    http: proxy://address   # will be used in the instances unless overridden with `--cloud-init`
    https: proxy://address
  default-stream: release # when using this remote, this will be the default stream used to find images
  streams:                # base URLs to image remotes
    release: https://cloud-images.ubuntu.com/releases   # may need to be expanded when v3 streams exist
    daily: https://cloud-images.ubuntu.com/daily
    minimal: https://cloud-images.ubuntu.com/minimal/releases
  images:                 # this has the same format as `client.images` above, with lower precedence
    core:
      aliases: [core16]
      image: http://cdimage.ubuntu.com/ubuntu-core/16/stable/current/ubuntu-core-16-amd64.img.xz
    core18:
      aliases: [core18]
      image: http://cdimage.ubuntu.com/ubuntu-core/18/stable/current/ubuntu-core-18-amd64.img.xz
    snapcraft:core16:
      description: "Snapcraft builder for Core 16"
      aliases: [snapcraft:core]
      image: http://cloud-images.ubuntu.com/minimal/releases/xenial/release/ubuntu-16.04-minimal-cloudimg-amd64-disk1.img
      kernel: http://cloud-images.ubuntu.com/releases/xenial/release/unpacked/ubuntu-16.04-server-cloudimg-amd64-vmlinuz-generic
      initrd: http://cloud-images.ubuntu.com/releases/xenial/release/unpacked/ubuntu-16.04-server-cloudimg-amd64-initrd-generic
    snapcraft:core18:
      description: "Snapcraft builder for Core 18"
      image: http://cloud-images.ubuntu.com/minimal/releases/bionic/release/ubuntu-18.04-minimal-cloudimg-amd64-disk1.img
      kernel: http://cloud-images.ubuntu.com/releases/bionic/release/unpacked/ubuntu-18.04-server-cloudimg-amd64-vmlinuz-generic
      initrd: http://cloud-images.ubuntu.com/releases/bionic/release/unpacked/ubuntu-18.04-server-cloudimg-amd64-initrd-generic
  # below are configured instance definitions
  instance-name:          # this is a configured instance on the `local` remote
    name: ~               # this allows renaming the instance
    cpus: 4               # this changes the number of CPUs configured for the instance
    memory: 2GB           # this changes the amount of memory available to the instance
    gpus: 1               # give the instance a single random available GPU

other-remote:
  address: 1.2.3.4:1234   # connecting to a remote might require providing a passphrase
  listen-address: 0.0.0.0:1234
  driver: hyperkit        # platform default, changing may not be possible
  instance-name:          # this is a configured instance on the `other-remote` remote
    cpus: 4               # this changes the number of CPUs configured for the instance
    memory: 2GB           # this changes the amount of memory available to the instance
    gpus:                 # give the instance the listed GPUs
      - 0:1:1             # GPU identifier, format platform-dependent
      - 0:1:2
```
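To make the dotted-key addressing concrete against this tree, a few hypothetical lookups (the values are the ones in the example above; none of this is committed syntax):

```console
$ multipass get local.driver
qemu
$ multipass get local.images.core.image
http://cdimage.ubuntu.com/ubuntu-core/16/stable/current/ubuntu-core-16-amd64.img.xz
$ multipass set other-remote.instance-name.cpus=8
```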
That's the extent of configuration I can come up with right now; please have a look through for errors, missing bits, or plain stupidity. ;)
Doubts:
- `mounts` would be stricter if it was `target_path: source_path`, to ensure key (and hence mount target) uniqueness and to allow the same source to be mounted in multiple places; but that could be confusing
- we could treat `streams` the same way LXD does, and make them remotes (of different types)
- putting instances top-level under remotes means name clashes with configuration keys; putting them under `remote.instances.instance` means more typing, unless we maybe define `remote:instance` as a shortcut for it (sketched below); along with `default-remote` this could arguably be shortened to `multipass config name`, which gets expanded to `$default-remote:instance` if `name` is not found, and then to `$default-remote.instances.instance`
- again taking LXD for inspiration, they have streams support both in the client and in the daemon (see their API for launching containers); the question is how that idea affects the above configuration
- what does removing a configuration entry mean?
  - no-op, or reset to default?
  - YAML's `null` (`~`) could be used to reset instead
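A sketch of how the shortcut resolution from the third point could work, assuming an instance called `foo` on the default (`local`) remote:

```console
$ multipass config foo
# 1. no top-level configuration key or remote named "foo" exists, so…
# 2. …expand to "$default-remote:foo", i.e. "local:foo", which is…
# 3. …itself shorthand for "local.instances.foo"
```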
I think this is quite a big deal and I am still trying to wrap my head around it, so sorry if the comments and questions below sound confused or contradictory.
Overall, it is nice to have a long view of where to go, but perhaps it would be more manageable to split things into elementary steps (identifying dependencies/precedences early and then considering each one in detail).
Perhaps we don't need to resolve the end-goal in detail for now. In particular, the exact final schema is less of an issue to begin with IMO. Personally my concerns go more toward how the feature would work (whatever the exact config contents). Here are some questions and things to think about:
- general symbolic representation/structure
- [x] this seems to be settled as some form of property tree
- persisted format
- [x] yaml, also looks consensual
- [ ] how many files really? full copies or each entity with its own file and its own config?
- format in memory - do we...
- [x] disperse information over attributes of program entities (e.g. daemon.backend = libvirt, foo.bar.doesBlah = true)?
- [ ] or keep information in a centralized DB that all entities refer back to?
- [ ] keep yaml objects in mem and keep referring back to them?
- [ ] transform it to some other intermediate format?
- [ ] custom? (like current DaemonConfig)
- [ ] general? (e.g. boost ptrees)?
- [ ] how does this affect type-checking and validation?
- Instances are currently persistified by the daemon, so mixing in instances with the config means the daemons are not only consumers but also producers
- [ ] do we change that premise?
- [ ] I suppose the clients would still be responsible for persistifying their config? Do they all write to the same file? So we need file locking?
- How do we drive "online" daemon updates (while running)?
- [ ] when using the client to change the config, I suppose we could just notify the appropriate "remotes". But what if the backend file is edited directly? do daemons poll in some sort of loop? careful with impacts on thread-safety and atomicity...
- at what point in the chain of consuming a yaml do we approach validation?
- [x] ideally ASAP upon reading (I suppose) but we may not have all necessary info to judge at that point. Some things may depend on state or other information external to configuration and reading entity (e.g. that port is already taken; that gpu config is invalid in this system...);
- [x] I suppose the client would have to wait for confirmation from the daemon(s)?
- the configuration needs to be conveyed over the network, but still written to locally
- [ ] again, how is everything kept in sync? which is the authoritative version?
- [ ] how do daemons inform clients of changed configs? do they initiate communications? inform them on next contact? have to keep track of what clients know what?
- So we want multiple remotes... We know we can also have multiple clients...
- [ ] so we have a distributed n-to-n relationship? and we need to implement a (small-scale) file-backed DB that needs to be replicated and atomically read and written by every intervenient? Might we be better off using some existing NoSQL solution rather than implementing it all ourselves?
- concerning the config CLI, it is different from both lxc and git… Is it modeled after something else? Otherwise, wouldn't it be better to stick to something existing?
> Perhaps we don't need to resolve the end-goal in detail for now. In particular, the exact final schema is less of an issue to begin with IMO. Personally my concerns go more toward how the feature would work (whatever the exact config contents). Here are some questions and things to think about:
Sure, this was a brain-dump for determining the user experience of all this - for that we need to look ahead so we don't step in the wrong direction.
I wanted this issue to only deal with how the data is presented to the user in a CLI environment; I don't put any requirements on how it is handled internally by either the client or the daemon. I have omitted your comments that deal with that.
> - general symbolic representation/structure
>   - [x] this seems to be settled as some form of property tree
Yes, that is my proposal and the direction a lot of our projects are going.
> Instances are currently persistified by the daemon, so mixing in instances with the config means the daemons are not only consumers but also producers
Not sure what you mean here. Adding an instance entry in the YAML above would be an error (instance not found), removing it would be a no-op on the instance configuration.
> at what point in the chain of consuming a yaml do we approach validation?
Each entity should be responsible for validating its own piece of the configuration (the client validates the `client.` subtree, each remote its own).
> the configuration needs to be conveyed over the network, but still written to locally
The YAML is just a presentation format for the user. To show a remote's configuration, the client has to ask the remote to provide it with its current config. No remote config is persisted by the client; it is just sent over the wire.
> concerning the config CLI, it is different from both lxc and git… Is it modeled after something else? Otherwise, wouldn't it be better to stick to something existing?
It is definitely inspired by LXD, but early on we decided not to have nested commands (`lxc config device edit …`) to keep our CLI clean. What kind of changes would you suggest?
Thanks for the reply. OK, perhaps I missed the scope here. I thought this was more or less ready to go for implementation and found myself grasping for the architecture part. Anyway, you already clarified some things.
> Yes, that is my proposal and the direction a lot of our projects are going.
Sure, sounds like the way to go.
> Not sure what you mean here [...]
Oh, I was also under the impression that this config would end up replacing the current image and instance DBs. That would include state info not necessarily driven by the client, which is why I mentioned the daemon as producer. If this is only strict config, that makes it simpler (which often implies better :wink:)
> Each entity should be responsible for validating its own piece of the configuration [...]
OK, I was also thinking about each entity internally. Immediate validation means the effects of a `multipass set` need to be synchronous. It can't be just writing info somewhere that objects later read as needed, so this impacts my point 3 above (info would have to be propagated).
> [...] No remote config is persisted by the client [...]
OK, so we're separating client/remote configs, whatever the backend. I wasn't sure you wanted that.
> It is definitely inspired by LXD [...] What kind of changes would you suggest?
`git config` is quite similar, but nesting (only) the commands (not the contents). So getting, setting, and unsetting are all achieved with `git config` (+ the appropriate option). And there is no '=' between key and value. If we followed that, we could do:

- `multipass config <key>` to get a yaml tree
- `multipass config <key> <val>` to set
- `multipass config --add <key> <val>` to add
- `multipass config --get <key>` to get, etc.
This would make it clear we were talking about configuration (plain get/set could be less clear). It would allow grouping config documentation under a single command and it would probably be more familiar.
> `git config` is quite similar, but nesting (only) the commands (not the contents). So getting, setting, and unsetting are all achieved with `git config` (+ the appropriate option). And there is no '=' between key and value. If we followed that, we could do:
>
> - `multipass config <key>` to get a yaml tree
> - `multipass config <key> <val>` to set
> - `multipass config --add <key> <val>` to add
> - `multipass config --get <key>` to get, etc.
The `--add` / `--get` seem to me like nested commands even if they pretend not to be… What happens if you `multipass config --add <key> <val> --get <key> --add <key> <val>`? Suddenly options actually become positional arguments…
> This would make it clear we were talking about configuration (plain get/set could be less clear). It would allow grouping config documentation under a single command and it would probably be more familiar.
Maybe we can still do with `config` alone…

- `multipass config [<subtree|key> …]` to get / edit a yaml tree, or a value if a single full key was given
- `multipass config <key>=<val> [<key>=<val> …]` to set multiple values at the same time
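For example, under that variant (a hypothetical session):

```console
# a single full key prints just the value
$ multipass config client.default-remote
local

# a subtree prints (or, interactively, opens for editing) the YAML below it
$ multipass config client.launch-defaults

# key=value pairs switch to setting, several at a time
$ multipass config client.default-remote=other-remote local.network=10.0.7.0/24
```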
> The `--add` / `--get` seem to me like nested commands even if they pretend not to be
They are, that is why I said nesting (only) the commands (not the contents). I thought it could be worth considering.
> What happens if you `multipass config --add <key> <val> --get <key> --add <key> <val>`?
They would be mutually exclusive (see `git help config`).
> Maybe we can still do with `config` alone…
Hmm, I prefer the original then. To be fair, I have meanwhile noticed that `get`/`set` would be consistent with `snap`.
@Saviq: how do you feel about replacing `client.primary-name` with `client.petenv-name`? I would favor it, given that "petenv" is already spread throughout code identifiers, while "primary" is isolated as the current value of the `petenv_name` constant.
@ricab it's for presentation to the user; we'd have to explain to them what it means, and why we call it "primary" there but "petenv" here.
Right, good point.
We've been iterating on how an "integrate with $app" configuration could work, and I'd like to document what we're thinking:
```yaml
client:
  apps:
    windows-terminal:
      profiles: primary              # or "none", "all"
    windows-terminal-private:        # Windows Terminal with a custom settings path
      type: windows-terminal
      settings-file: C:\some\path\settings.json
      profiles: all
    iterm2:
      profiles: primary
    lxc:
      remotes: all
      prefix: mp-
    lxc-beta:                        # a parallel install of LXD from the `beta` channel
      remotes: all
      command: snap run lxd_beta.lxc
  gui:
    terminal-app: windows-terminal-private   # pointer at one of client.apps above
```
Each app could have a distinct set of properties that can be extended over time. The `type` would default to the key in the `client.apps` map, and would route the settings below to the right integration points.
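Combined with the proposed get/set commands, wiring an integration up could then look something like this (a sketch using the names from the example above):

```console
# expose all instances as Windows Terminal profiles
$ multipass set client.apps.windows-terminal.profiles=all

# point the GUI at the custom-settings Windows Terminal entry
$ multipass set client.gui.terminal-app=windows-terminal-private

$ multipass get client.apps.lxc.prefix
mp-
```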
From further discussion on IRC, it would be nice to support a general `terminal`, where the user could specify the executable/command, and a `linux-terminal`, where the user could specify a terminal that was known to the system via a desktop file.
> From further discussion on IRC, it would be nice to support a general `terminal`, where the user could specify the executable/command, and a `linux-terminal`, where the user could specify a terminal that was known to the system via a desktop file.
Sure, the above is not meant to be exhaustive :)
> Sure, the above is not meant to be exhaustive :)
I know, just wanted to record it for the future :slightly_smiling_face:
So here's a case that I don't think we've fully considered:
```console
$ multipass launch --name driver
Launched: driver
$ multipass get --keys
client.gui.autostart
client.gui.hotkey
client.primary-name
local.bridged-network
local.driver
local.driver.cpus
local.driver.disk
local.driver.memory
```
I think we need to plan for `<remote>.instances.<instance-name>.*` to be the disambiguated key (e.g. for `multipass config`), with `local.<instance-name>.*` being a shorthand, where not ambiguous (e.g. `multipass get`/`set`).
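In other words, something like the following (hypothetical keys, continuing the `driver` instance example):

```console
# fully qualified form, always unambiguous (e.g. required for `multipass config`)
$ multipass get local.instances.driver.cpus

# shorthand form, accepted where not ambiguous; here it collides with the
# `local.driver` configuration key, so the full form would be required
$ multipass get local.driver.cpus
```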
This issue is a relic. We still take it as a compass when doing something regarding the configuration and have slowly been pursuing parts of it. I think we should keep it open and mark it as an epic.