Config file management
(moderated summary by WG)
Context
Plan of Action
In-ecosystem resources
- Common directories
- Read/write config
- keyring for storing passwords
External inspiration
Challenges
- Cross platform paths in config files. See also #10
- Integrating information from environment, command line arguments, defaults, and cascading config files
- Every existing application is probably littered with special cases, migration schemes and special rules for how settings are handled. I'm not sure it is possible to encode all this into an API without making it extremely hard to use.
- Good error messages, see https://github.com/serde-rs/serde/issues/1184
@spacekookie's original post
In the first meeting, we discussed the task of file management on different platforms (particularly configurations) and how it can be made better.
@Eijebong summarised it like this
If I need a config file, I don't want to know that it should be
${XDG_CONFIG:${XDG_HOME}/.config:/home/${user}/.config on linux, %AppDir%/App/ on windows and something else on osx [...]
There is a crate for "determining system configuration" (app-dirs-rs) but it seems unmaintained and not up to date
The configure crate by @withoutboats abstracts over configuration in general (as the name suggests), but doesn't seem to have an adapter for configuration files right now (except for Cargo.toml, but that's not the use case we have in mind I guess).
I think it'd be good to lay out some requirements for a crate(s) that would fill this gap:
- Accepts a file name to either load or save
- Abstracts away platform-specific paths (i.e. as the application developer I only want to say "myapp.conf" and have the crate handle where and how "myapp.conf" gets loaded/saved)
I'd be OK with having multiple sub crates for various platforms, and then a "parent" crate that allows abstracting over the platform where all I do is specify a file name I want to load/save.
Bonus if the crate can allow me to search custom directories, or tweak the order (on platforms where applicable).
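A minimal sketch of the "search custom directories, in a tweakable order" requirement, using only the standard library. The function name `find_config_file` and the idea of passing the search order as a slice are assumptions made for illustration; no existing crate is implied.

```rust
use std::path::PathBuf;

/// Return the first `dir/file_name` that exists as a regular file,
/// trying `search_dirs` in the order given by the caller.
/// Hypothetical helper, not the API of any existing crate.
pub fn find_config_file(file_name: &str, search_dirs: &[PathBuf]) -> Option<PathBuf> {
    search_dirs
        .iter()
        .map(|dir| dir.join(file_name))
        .find(|candidate| candidate.is_file())
}
```

The application only names the file ("myapp.conf"); a platform layer would be responsible for producing `search_dirs`, and the caller can reorder or extend that list before searching.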
but doesn't seem to have an adapter for configuration files right now (except for Cargo.toml, but that's not the use case we have in mind I guess).
Nope, the configure crate's default "source" is definitely designed for use cases where the person configuring the application is also the author - such as network services. However, the intent is for libraries to use configure, so that the application author can have total control over the source of configuration.
A configuration source that integrates with configure and is designed for CLIs would be a great addition, and possibly one I'd be interested in upstreaming into configure proper.
I'm thinking this issue might be part of a bigger topic.
- How do we store user-created configuration files cross-platform? (So far this issue has only touched on this one specifically. E.g. Linux $HOME/.config)
- Where do we store temporary files cross-platform? (Does not need to persist between boots, e.g. Linux /tmp.)
- How do we store essential user data files cross-platform? (Stuff like login tokens, usually generated by applications. E.g. $HOME/.local on Linux.)
- How do we store non-essential data files cross-platform? (Needs to persist between boots, e.g. Linux $HOME/.cache.)
It's probably uncommon for a single application to use all of these, but one or more should be common enough. It feels like these questions are part of the same problem; perhaps it might be useful to consider all of these questions as part of this discussion?
For the question of temporary files there is already a crate which seems to do its job quite nicely (though I've only used it in limited scenarios so far, maybe it can be improved!)
As for the rest…I think it would be pretty cool if we could create (or find and improve existing) crates that mirror the same behaviour for other configuration, essential and non-essential data files as well.
It should be as simple as saying Configdir::new("my_app_name") and being able to write and read configurations from it.
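To make the idea concrete, here is a rough, Linux-only sketch of how a per-application directory could be resolved. It follows the XDG Base Directory defaults ($XDG_CONFIG_HOME falling back to $HOME/.config, and likewise for cache and data). The names `DirKind` and `xdg_app_dir` are hypothetical, and the environment is passed in as a closure purely so the logic can be exercised without touching the real environment. A real implementation would also need Windows and macOS branches, and would have to ignore relative paths as the XDG spec requires.

```rust
use std::path::PathBuf;

/// Which kind of per-application directory is wanted (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub enum DirKind {
    Config,
    Cache,
    Data,
}

/// Resolve `<base>/<app_name>` using the XDG defaults.
/// `env` looks up an environment variable; pass `|k| std::env::var(k).ok()`
/// to read the real environment.
pub fn xdg_app_dir<F>(kind: DirKind, app_name: &str, env: F) -> Option<PathBuf>
where
    F: Fn(&str) -> Option<String>,
{
    let (var, fallback) = match kind {
        DirKind::Config => ("XDG_CONFIG_HOME", ".config"),
        DirKind::Cache => ("XDG_CACHE_HOME", ".cache"),
        DirKind::Data => ("XDG_DATA_HOME", ".local/share"),
    };
    // An unset or empty XDG variable means "use the default under $HOME".
    let base = match env(var).filter(|v| !v.is_empty()) {
        Some(dir) => PathBuf::from(dir),
        None => PathBuf::from(env("HOME")?).join(fallback),
    };
    Some(base.join(app_name))
}
```

With this shape, something like `Configdir::new("my_app_name")` would be a thin wrapper around `xdg_app_dir(DirKind::Config, "my_app_name", |k| std::env::var(k).ok())`.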
Edit Just as I hit "Comment" I found this crate here
Hi, @soc! The Rust CLI working group is talking about cross-platform configuration file management and your directories crate has come up. Looking at your Github profile, I see you have a Java directories package as well, so you seem to have some expertise in this area. Wanna chime in here? :)
@killercup Sure, how can I help?
@soc awesome! We were currently doing some research about the status quo of crates that are useful when writing CLI tools, work cross-platform and are maintained. For example, we want to come up with a good story around how to easily configure a CLI tool—with config files, env vars, and CLI flags. This issue is focussing on the handling of config files. @kbknapp already listed some good requirements in https://github.com/rust-lang-nursery/cli-wg/issues/7#issuecomment-367085114.
Do you think directories is a good foundation here? What are your plans for it? Can we help you get it to 1.0? :)
(@spacekookie and @yoshuawuyts probably have more to say!)
For example, we want to come up with a good story around how to easily configure a CLI tool—with config files, env vars, and CLI flags.
directories is intentionally focused solely on dealing with operating system defaults. The reasoning for this is not because I believe that other venues for configuration are not important, but to provide the most minimal, focused and stable API I can get away with.
For instance, when dealing with CLI flags, the first issue you have is that of style (-h and --help vs. -help; -xyz vs -x, -y, -z; key=value vs. key value; and that's just Linux/macOS ... Windows has its own, different rules with /h etc.).
There is potentially a lot of complexity and many moving parts involved when trying to provide a CLI interface that makes everyone happy.
Do you think directories is a good foundation here?
I do think that directories is a good foundation for dealing with the operating system standards part of your goals.
I believe that dealing with CLI flags should probably be done in a separate library, or in a way more specific to the individual application's needs, because dealing with CLI flags is very application-specific.
In the end individual applications already need to have some custom code anyway to deal with migrating from storing their data directly in $HOME to following the platform standards. Dealing with CLI flags will probably be the same.
That's why directories only tells developers which directories they should be using, but does not get involved with creating directories itself, or making decisions about the priority of multiple directories (for instance platform defaults vs. CLI flags vs. config files).
Application-specific code will be required to handle such issues, and I want directories to avoid getting involved in that: Often the cost of complexity to solve such issues in a general fashion in a library is way higher than dealing with it on the application side, especially when handling (legacy) applications with their own folder in $HOME – without breaking things for existing users.
Here is an example of an application that makes use of directories (the JVM version) and deals with migration compatibility, property files, and application-specific env vars: https://github.com/coursier/coursier/pull/676.
What are your plans for it? Can we help you get it to 1.0? :)
My plan is to declare it as stable as fast as possible. I think the main blockers are
- having more people use and test it, to make sure it works
- a thorough review of the decisions made concerning the various paths chosen (by someone who isn't me): https://github.com/soc/directories-rs/issues/2
- a review of the Windows-specific code: https://github.com/soc/directories-rs/issues/1
I have created tickets for the remaining issues I mentioned: https://github.com/soc/directories-rs/issues/1 and https://github.com/soc/directories-rs/issues/2.
A more general note: There is a vast difference between selecting and standardizing on crates that provide certain functionality (like CLI parsing, config file parsing) and having one standardized way of handling application configuration:
With the former you probably get crates that do almost everything and allow configuration of almost everything.
With the latter, you want to be highly selective and make actual choices how things can be specified, and not allow a free for all in terms of decisions a developer can make.
As you've noticed, I've opened some issues at directories-rs. I'd hold off on releasing a 1.0 before there are some consumers of the crate.
There is a vast difference between selecting and standardizing on crates that provide certain functionality (like CLI parsing, config file parsing) and having one standardized way of handling application configuration
Absolutely. We already have some great libraries for CLI args, and I'd love to have an equally good story for dealing with config files. That is not one crate – it's several, built on top of and complementing each other :)
(We'll hopefully see more concrete proposals for this in #6!)
I think the focus should be less on a config file format and more on an API to get to those files. As a developer I might still want to be able to choose a format, say json or toml or ini, via whatever serde backend exists to read/write my configuration files. But I don't want to have to worry about where to put it.
Not sure why you brought up CLI parsing. Although thinking about it now, I'm not sure how clap.rs handles windows arguments :sweat_smile:
I haven't had a chance to play around with your crate yet, but from the README it looks like it already exposes pretty much all the directory paths we might be interested in. At that point it becomes a question of making the API more ergonomic, e.g. maybe there could be a function to easily list configuration files for the given application (or None if there are none).
Not sure why you brought up CLI parsing.
I brought it up, sorry :)
So, I've been thinking about what an all-around config solution might look like. We should not implement such a thing right now, but discuss what needs to happen to get there!
Here's a small proposal that integrates ideas from clap (v3, this is future!) and configure to get the discussion going:
#[derive(Debug, Deserialize, Clap, Configure)]
#[config(prefix = "diesel")]
struct Args {
    #[clap(short = "q", long = "quiet")]
    quiet: bool,
    #[clap(long = "database-url")]
    database_url: String, // a URL or file path, so a string rather than a bool
    #[clap(subcommands)]
    command: DieselCliCommand, // an enum defining subcommands with their own fields and attributes
}

fn main() {
    // Sources are read in order, so later ones can override earlier ones.
    let args = Args::configure()
        .read_from(configure::adaptors::config_file::toml("diesel_cli.toml")) // Invokes serde
        .read_from(configure::adaptors::env_file()) // dotenv
        .read_from(configure::adaptors::env()) // std::env
        .read_from(configure::adaptors::clap_auto_init()); // Clap incl. early exit on `-h` and stuff like that
}
You can then:
- pass --database-url=something.sqlite
- execute the program with env DIESEL_DATABASE_URL=something.sqlite
- Have a .env file with DIESEL_DATABASE_URL=something.sqlite
- Have a ~/.config/diesel_cli.toml file
Is that approximately the direction in which you want to go? What needs to happen to get there?
I think the CLI/conf/env story should be in another issue.
Sure, that was just for inspiration and to set some context. (If you have other use cases/ideas, please tell us :))
I have a couple of requests in the structopt issues about that (no ideas, but people wanting something like https://github.com/rust-lang-nursery/cli-wg/issues/7#issuecomment-367673115)
Like @spacekookie said, I think it should focus on abstracting over platform specific issues and not on the format, or providing "key->value" style API.
As the application writer, I want to just specify a file name, and let this crate handle where to store it. I then worry about formats, reading/writing, etc.
Then later on someone could write a generic crate to abstract over this configure crate, using something like serde to give a key->value style API.
Here's how I see the crate structure playing out (note, the crate names are just generic and not referring to anything existing right now).

At a former employer, I wrote a config file management library (in Python) that turned out to be popular with my fellow developers (because it was easy to add to an existing project) and with our operations staff (because all our tools worked the same way, and the configuration was flexible enough for most of our use-cases). It worked like this:
- an application would include a configuration file containing all the generic defaults, in the Python ConfigParser format (basically, an INI file)
- the application calls into the library, passing the application name and the defaults file
- the library reads the defaults
- the library reads /etc/xdg/$appname/config.cfg (the standard XDG system-wide config directory, if it exists) and overlays those settings on top of the defaults
- the library reads ~/.config/$appname/config.cfg (the standard XDG per-user config directory, if it exists) and overlays those settings on top of the defaults
- For each section and key in the combined configuration, the library would build a string like $appname_$section_$key and upper-case it. If an environment variable with that name existed, its value would replace the value loaded from the config files
- the library returns the fully-populated configuration data to the application
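The steps above could be sketched in Rust roughly as follows. This is not the original Python library; it only models its data flow under some assumptions: parsing INI files is left out, the settings are a plain section -> key -> value map, and the names `Settings`, `overlay`, `env_var_name` and `apply_env` are made up for this illustration.

```rust
use std::collections::BTreeMap;

/// section -> key -> value, i.e. the data model of an INI file.
pub type Settings = BTreeMap<String, BTreeMap<String, String>>;

/// Overlay `upper` on top of `base`: keys present in `upper` win,
/// every other key in `base` is kept as it was.
pub fn overlay(mut base: Settings, upper: &Settings) -> Settings {
    for (section, keys) in upper {
        let target = base.entry(section.clone()).or_default();
        for (key, value) in keys {
            target.insert(key.clone(), value.clone());
        }
    }
    base
}

/// Build the upper-cased `$appname_$section_$key` variable name.
pub fn env_var_name(app: &str, section: &str, key: &str) -> String {
    format!("{}_{}_{}", app, section, key).to_uppercase()
}

/// Replace every value for which `env` reports an override.
/// Pass `|k| std::env::var(k).ok()` to consult the real environment.
pub fn apply_env<F>(app: &str, mut settings: Settings, env: F) -> Settings
where
    F: Fn(&str) -> Option<String>,
{
    for (section, keys) in settings.iter_mut() {
        for (key, value) in keys.iter_mut() {
            let name = env_var_name(app, section, key);
            if let Some(replacement) = env(name.as_str()) {
                *value = replacement;
            }
        }
    }
    settings
}
```

The full pipeline is then `apply_env(app, overlay(overlay(defaults, &system), &user), env)`. Note that only keys that already exist after the overlays can be overridden from the environment, which matches the description above.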
Pros:
- The application only needs to call a single function.
- The defaults file can contain all the possible config settings, example values and even documentation for them
- maybe not the best possible location for such things, but still better than "scattered across the application in each bit of code that reads a config variable"
- System-wide configuration is useful for provisioning tools like Ansible/Puppet/Chef/etc. that deploy the tool to automatically configure it to work on that host (for example, setting a default HTTP proxy, or picking a geographically close rendezvous server)
- Per-user configuration means users can set things up the way they like them
- Environment variable configuration means one tool can launch another tool and force some configuration option it needs (like output format, or log file name)
- At each stage, you can override some defaults without replacing all of them
- for example, you can use Ansible/Puppet/Chef/etc. to change the "default HTTP proxy" in the system-wide config without worrying about users who have created their own config missing out
- Because the INI format is limited, overlaying configuration files is simple, with predictable results
Cons:
- It can be difficult to fit an application's configuration needs into the limited vocabulary of INI files
- With no configuration schema, the application has to do all the deserialization/validation work itself
- If a config section foo has key bar_baz, and section foo_bar has key baz, both will be mapped to the environment variable $appname_FOO_BAR_BAZ and there isn't really any way around that.
- There's no good way to extend this to CLI parsing, not least because it would be impossible to generate decent --help output from the limited information we have
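The naming collision mentioned in the cons can at least be detected up front, even if it cannot be avoided. The sketch below assumes the same upper-cased `app_section_key` naming scheme; `env_name_collisions` is a hypothetical helper, not something the original library is known to have had.

```rust
use std::collections::BTreeMap;

/// Upper-cased `app_section_key`, as in the scheme described above.
pub fn env_var_name(app: &str, section: &str, key: &str) -> String {
    format!("{}_{}_{}", app, section, key).to_uppercase()
}

/// Group (section, key) pairs by the variable name they map to and
/// keep only the names that are claimed by more than one pair.
pub fn env_name_collisions(
    app: &str,
    pairs: &[(&str, &str)],
) -> BTreeMap<String, Vec<(String, String)>> {
    let mut by_name: BTreeMap<String, Vec<(String, String)>> = BTreeMap::new();
    for &(section, key) in pairs {
        by_name
            .entry(env_var_name(app, section, key))
            .or_default()
            .push((section.to_string(), key.to_string()));
    }
    by_name.retain(|_, owners| owners.len() > 1);
    by_name
}
```

Run over the keys of the defaults file, a non-empty result could be turned into a startup error or a test failure.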
If I were to attempt something similar in Rust:
- I'd try to find a way to build it around a configuration schema object (following the model of serde and structopt) instead of tossing around raw config files
- I'd probably leave the environment-variable config the same; it's clunky but we didn't need it very often, so it wasn't a problem in practice
- I'd absolutely 100% require the ability to overlay config files for different sources, though—I don't know how that would work exactly with richer data formats than INI, but I'd have to find a way
I think the focus should be less on a config file format and more on an API to get to those files.
One reason to consider a standard config file format, or at least a standard config data model: on Windows, perhaps the standard configuration source could/should be the Registry, rather than the filesystem?
After some research on that, it seems that most developers recommend and prefer files over the registry:
- https://softwareengineering.stackexchange.com/questions/144238/ini-files-or-registry-or-personal-files
- https://blog.codinghorror.com/was-the-windows-registry-a-good-idea/
Since this thread is about the location of config files rather than their contents this may be a bit off topic, but here goes anyway:
Similar to how structopt works, I'd love to do
#[derive(Structconfig)]
pub struct Config {
    timeout: u8,
    #[structconfig(name = "retries", default = 3)]
    no_of_retries: u8,
    files: Vec<PathBuf>,
}
and have all the config stuff taken care of for me!
Edit:
I'd try to find a way to build it around a configuration schema object (following the model of serde and structopt) instead of tossing around raw config files
Didn't see that it had already been suggested.
@Screwtapello
How did your code deal with first run, if there wasn't a config file? Did it assume you wanted to use the defaults, or did it exit and prompt you to create a config file? (or did it walk you through creating the config file interactively?)
@derekdreery
At first run, it would use the defaults. For the various tools we created, every config option always had a sensible out-of-the-box default. Things the program absolutely could not know without asking would generally be command-line arguments, not config options.
It's a big world, and I'm sure there are some potential config options that cannot possibly have a sensible default, but I can't think of one right now. If anyone has an example, I'd love to hear it.
Another aspect of config management to consider is passwords. Looks like there is a keyring crate that could use some polish and advertising.
Hey everyone. I'm the current dev lead of conda, which is a cross-platform, system-level package manager. It's currently written in Python, but we're in the initial stages of considering transitioning key pieces to Rust.
Just wanted to add to this discussion how we do configuration, because it's been powerful and has worked out very well. It's also very similar to what @Screwtapello described.
For each invocation of our executable, we build up a configuration context object from four sources of configuration information:
- hard-coded default values
- (potentially multiple) configuration files, including support for files in ".d" directories
- environment variables
- command line flags
These are linearized in a way that the configuration sources conceptually closest to the process invocation take precedence. That is, if a configuration parameter is provided as a CLI flag, but also provided in a configuration file, the CLI-provided value would win. I guess the insight here is that most CLI applications deal with at least one configuration file, environment variables, and CLI flags anyway, and we've just realized that they all represent basically the same type of information, and can be generalized and unified.
One capability that was especially important for us to add was the ability for sysadmins to lock down configuration for the entire system in "lower-level" read-only files. As we merge the sources of configuration information, we provide a flag sort of like the css !important that lets the lower-level value be the final value.
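A rough sketch of that merge, not taken from conda's code: sources are visited from lowest to highest precedence, later sources normally win, and a value marked as locked by an earlier (lower-level) source can no longer be replaced. The names `Entry`, `locked` and `merge_with_locks` are assumptions made for this example, and only flat string values are handled (conda also merges sequences and maps).

```rust
use std::collections::HashMap;

/// A configuration value plus the "!important"-style marker.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub value: String,
    /// When true, sources merged later must not override this value.
    pub locked: bool,
}

/// Merge `sources`, ordered from lowest to highest precedence
/// (e.g. defaults, system file, user file, environment, CLI flags).
pub fn merge_with_locks(sources: &[HashMap<String, Entry>]) -> HashMap<String, String> {
    let mut merged: HashMap<String, Entry> = HashMap::new();
    for source in sources {
        for (key, entry) in source {
            // A value locked by an earlier source stays final.
            let already_locked = merged.get(key).map_or(false, |e| e.locked);
            if !already_locked {
                merged.insert(key.clone(), entry.clone());
            }
        }
    }
    merged.into_iter().map(|(k, e)| (k, e.value)).collect()
}
```

Without any locked entries this reduces to "the source closest to the process invocation wins".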
I don't want to go into too much detail here. There's a blog post with more details, including how we deal with merging sequence and map-type configuration parameters. I did want to point all this out though as support for the usefulness of what @Screwtapello described.
As we merge the sources of configuration information, we provide a flag sort of like the css !important that lets the lower-level value be the final value.
An alternative model that achieves the same goal is to have separate "config" and "override" files:
- system-wide config
- per-user config
- environment variables and command-line flags
- per-user overrides
- system-wide overrides
The advantage over an !important flag is that you don't need special syntax in your config-file format (and therefore serde, etc.) while the disadvantage is that you have nearly twice as many config locations to document, and for users to check when diagnosing surprising behaviour.
the disadvantage is that you have nearly twice as many config locations to document, and for users to check when diagnosing surprising behaviour
@Screwtapello you could mitigate this by being able to generate something like the following
# Configs - some config option
1. There was no value at system-wide level. *value = default*
2. Found value *newvalue* at user level *value = newvalue*
3. There was no value at env/cli level *value = newvalue*
4. There was no value at user override level *value = newvalue*
5. Found value *newvalue2* at system override level *value = newvalue2*
6. Final value for *config option* is *newvalue2*
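Generating such a report is straightforward once each level is represented explicitly. The following is only a sketch: `Layer` and `explain` are hypothetical names, levels are passed from lowest to highest precedence, and the wording of the lines is simplified compared to the listing above.

```rust
/// One configuration level: its display name and the value it sets, if any.
pub struct Layer<'a> {
    pub name: &'a str,
    pub value: Option<&'a str>,
}

/// Walk `layers` from lowest to highest precedence and describe, step by
/// step, how the value of `option` ends up being what it is.
pub fn explain(option: &str, default: &str, layers: &[Layer]) -> Vec<String> {
    let mut current = default.to_string();
    let mut lines = Vec::new();
    for (i, layer) in layers.iter().enumerate() {
        match layer.value {
            Some(v) => {
                current = v.to_string();
                lines.push(format!(
                    "{}. Found value {} at {} level, value = {}",
                    i + 1, v, layer.name, current
                ));
            }
            None => lines.push(format!(
                "{}. There was no value at {} level, value = {}",
                i + 1, layer.name, current
            )),
        }
    }
    lines.push(format!(
        "{}. Final value for {} is {}",
        layers.len() + 1, option, current
    ));
    lines
}
```

A tool could expose this behind a flag or subcommand so users can diagnose surprising behaviour without reading every config location by hand.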
One option would be to have an API like
Config::from(system_overrides, commandline, environment, config_file, legacy_config_file, system)
Where people describe the order of settings they want to have and the library resolves settings in that order until a value is found.
I believe having some hard-coded, common-sense lookup scheme would be nice, but I fear that many applications would not fit well into it.
I think an additional bit that's important to get right is to track the origin of each setting, so that people don't end up with some_setting = "value", but with some_setting = ("value", source). I think this would make it much more transparent where settings come from, and easier to debug.
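Both ideas, an application-defined lookup order and origin tracking, fit in a small sketch. Everything here is hypothetical: the `Source` variants simply mirror the argument names used in the `Config::from(...)` line above, `from_sources` stands in for that constructor, and values are plain strings.

```rust
use std::collections::HashMap;

/// Where a setting came from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    SystemOverrides,
    CommandLine,
    Environment,
    ConfigFile,
    LegacyConfigFile,
    System,
}

pub struct Config {
    /// Sources in the order they are consulted; the first one that
    /// contains a key wins.
    sources: Vec<(Source, HashMap<String, String>)>,
}

impl Config {
    /// The caller decides the resolution order by ordering `sources`.
    pub fn from_sources(sources: Vec<(Source, HashMap<String, String>)>) -> Self {
        Config { sources }
    }

    /// Look up `key` and report both the value and the source it came from.
    pub fn get(&self, key: &str) -> Option<(&str, Source)> {
        self.sources
            .iter()
            .find_map(|(source, values)| values.get(key).map(|v| (v.as_str(), *source)))
    }
}
```

Because `get` returns the source alongside the value, an application can print messages such as "color = never (from CommandLine)" when users ask why a setting has a particular value.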