swift-argument-parser Add support for params files

Some tools/situations end up needing so many arguments that they can run into the maximum command line length. This is usually handled via argument files or parameter files: if an argument is prefixed with a marker (usually @), then the contents of the referenced file are read at that location in the command line as if the content were on the command line. This also usually works recursively, in that the file being read can include another one of these directives to read yet another file.

There can be two types of "modes" that are supported for reading these files:

multiline : Every line of the file is a single argument. This is the easy one: spaces, etc. are part of the argument and there is no need to understand escaping; just split on newlines and handle things.
shell quoted : The files are parsed the way a Bourne shell would parse them, which requires handling quotes and escapes. swift and clang support files in this form.
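
As a rough, language-agnostic illustration of the two modes (written in Python only because its standard library ships a shell-style tokenizer), expansion could look like the sketch below. The function name `expand_params`, the `mode` values, and the depth limit of 16 are assumptions made for this example, not part of any existing API.

```python
import shlex
from pathlib import Path


def expand_params(args, mode="multiline", prefix="@", _depth=0):
    """Replace each PREFIXpath argument with the arguments read from that file."""
    if _depth > 16:  # arbitrary guard against files that include each other
        raise RecursionError("params files nested too deeply")
    expanded = []
    for arg in args:
        # Anything that is not "<prefix><path>" is passed through untouched.
        if not arg.startswith(prefix) or len(arg) == len(prefix):
            expanded.append(arg)
            continue
        text = Path(arg[len(prefix):]).read_text()
        if mode == "multiline":
            # Every line is exactly one argument; spaces are kept as-is.
            tokens = text.splitlines()
        elif mode == "shell":
            # Bourne-shell-style quoting and escaping via the stdlib tokenizer.
            tokens = shlex.split(text)
        else:
            raise ValueError(f"unknown mode: {mode}")
        # A params file may itself reference further params files.
        expanded.extend(expand_params(tokens, mode, prefix, _depth + 1))
    return expanded
```

The expanded list takes the place of the original `@file` argument, so the rest of the parser never needs to know that a params file was involved.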

Reference:

Python ArgParse support: https://docs.python.org/3/library/argparse.html#fromfile-prefix-chars
Bazel docs with notes about generating files for these two modes: https://docs.bazel.build/versions/2.0.0/skylark/lib/Args.html#set_param_file_format

Mar 02 '20 16:03 thomasvl

This is something we've been thinking about for swift-doc, and it'd be great to have consensus around how this should work.

@thomasvl Thanks for sharing those references for parameter files. Coming from more of a Ruby / JavaScript background, I'm less familiar with this approach; in those languages, the more common pattern seems to be configuration files encoded as JSON / YAML / TOML.

I'm curious to hear whether we see these two approaches (parameter file and configuration file) as alternatives or complements to one another. Is it possible or indeed desirable to support both at once?

Apr 23 '20 15:04 mattt

My $0.02 is that params files are more meant as a way around OS arg limits and generally dealing with lots of args (especially file lists), so they reuse the main argument parsing logic, just using a (hopefully simple) tokenizer out of the file to supplement what comes on argv.

To me, configuration files tend to be a full replacement for the command line flags. Instead of just a list of flags, it is a more structured form of all the args, usually allowing slightly better grammar since you can do sub-objects, etc. instead of making really long arg names. So while the command line might have --foobar=1 and --foobaz=2, when you go to a config file it could have something like foo: { bar: 1, baz: 2 }; or instead of --mumble 1 --mumble 2 it can be: mumble: [1, 2].

I can't say I've seen something automatically bridge args and config files; but it could be possible.

Apr 23 '20 16:04 thomasvl

Config files are a really interesting question, too, @mattt! I think parsing those files (as opposed to params files) is more akin to the environment variables work in #109. Both config files and environment variables really need to act as defaults, with precedence given to the command-line arguments given at call-time. (I don't know about precedence between config files and the environment, though.)

All that is to say I think config files could be supported, but it depends on whether there's a design that wouldn't compromise the rest of the goals for the library.

Apr 24 '20 17:04 natecook1000

@thomasvl That's an excellent distillation of the problem. Your point about translating 1D argv to 2D nested data structures is giving me flashbacks to my struggle to find a cohesive scheme for mapping URI query parameters in AFNetworking and Alamofire.

@natecook1000 That's a great point about environment variables; I hadn't made that connection until now. I wonder if anyone's written a "Grand Unifying Theory" for command line executables that reconciles environment variables, .env files, configuration files, argument files, and arguments... I suspect it'd look an awful lot like the User Defaults System.

I'll think a bit more about how this might be done in a way that fits in with the design goals of the library. Supporting configuration files may not require a "Grand Unifying Theory" per se, but I agree that it's something that we can't rush into without understanding its role in the overall architecture.

Apr 24 '20 17:04 mattt

When @mattt suggested configuration files, I was thinking of an arg to read a configuration file, so I guess one question here is whether configuration files would be read automatically based on name and the current directory (like environment variables), or explicitly via an option.

This comes into play for how they interact with the other command line options. For params files, since they are listed on the command line, the exact order in the command line is clear; but with configuration files, it will depend on exactly how they end up being implemented/controlled.

Apr 24 '20 18:04 thomasvl

@thomasvl Yeah, it's another unfortunate bit of complexity we'd have to account for. What I've typically seen is that tools look for a configuration file in the current directory with a particular name (e.g. .swiftlint.yml) but also provide an option to pass a custom path (e.g. to allow for .swiftlint.production.yml or .config/swiftlint.yml). In either case, any command line options trump whatever's in the configuration file, and in practice this seems to work pretty well (it only really gets complicated when libraries decide to support multiple names or file formats).

Apr 24 '20 18:04 mattt

@mattt so it becomes a two pass decision?

Check/scan command line to see if an explicit configuration is given?

No configuration found
1. Check $CWD for a standard configuration file, load it (.swiftlint.yml).
2. Process the command line normally
Explicit configuration found
1. Do NOT check for/read a file from $CWD
2. Read/Load the explicit configuration file (.swiftlint.production.yml)
3. Process the rest of the command line

The open question is the ordering of steps 2 and 3 in the "Explicit configuration found" case: when does the config file get read? Before all the args, or at the spot of the explicit argument that sets its name?
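
A minimal sketch of that first pass, under stated assumptions: the `--config` option name and the `locate_config` helper are hypothetical, and the sketch takes one side of the ordering question by resolving the configuration file before any other argument is processed. Python is used purely for illustration.

```python
from pathlib import Path


def locate_config(argv, cwd, option="--config", default_name=".swiftlint.yml"):
    """Pass 1: return (config_path_or_None, argv_without_the_config_option)."""
    explicit = None
    remaining = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == option and i + 1 < len(argv):
            explicit = argv[i + 1]            # "--config path" form
            i += 2
        elif arg.startswith(option + "="):
            explicit = arg[len(option) + 1:]  # "--config=path" form
            i += 1
        else:
            remaining.append(arg)
            i += 1
    if explicit is not None:
        # Explicit configuration found: do NOT look in the working directory.
        return Path(explicit), remaining
    # No explicit configuration: fall back to the standard file, if present.
    default = Path(cwd) / default_name
    return (default if default.is_file() else None), remaining
```

The second pass would then process `remaining` normally, with whatever was loaded from the returned path acting as a source of defaults.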

Apr 24 '20 19:04 thomasvl

That's close, but not quite what I meant. I can't speak for how any of these tools are implemented, though; this is just how I'd expect such a system to behave:

Process command line arguments and keep track of which non-positional arguments were passed explicitly
If there's an option to specify a configuration file, and a value is either provided explicitly or through a default, try to read those values. (To clarify: configuration file values correspond only to labeled, not positional arguments)
For each non-positional argument of the command, use a constraint system to determine which value takes effect according to the following ranking:
1. explicit command line arguments (these always override any other values)
2. environment variables (for options that specify .environment or .customEnvironment names from #103)
3. configuration files
4. default values
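
To make the ranking concrete, here is a small sketch of a per-option lookup. It is illustrative Python rather than anything from the library; the `resolve` function and the four dictionaries are assumptions, and a real "constraint system" could be considerably richer than a first-match search.

```python
def resolve(name, cli, env, config, defaults):
    """Return (value, source) for one option, highest-ranked source first."""
    layers = [
        ("command line", cli),            # 1. always overrides everything else
        ("environment", env),             # 2. environment variables
        ("configuration file", config),   # 3. configuration files
        ("default", defaults),            # 4. default values
    ]
    for source, values in layers:
        if name in values:
            return values[name], source
    raise KeyError(name)


# Example: "level" is set in three places; the command line wins.
value, source = resolve(
    "level",
    cli={"level": "debug"},
    env={"level": "info"},
    config={"level": "warn", "color": "auto"},
    defaults={"level": "error", "color": "never"},
)
```

Returning the source alongside the value costs little and leaves room for reporting where each setting came from.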

Apr 24 '20 20:04 mattt

@mattt That looks like the right hierarchy to me, though tools should probably avoid allowing both config files and environment settings. Shaking my head thinking about the debugging nightmares for folks with two barely visible places for defaults....

May 07 '20 02:05 natecook1000

> Shaking my head thinking about the debugging nightmares for folks with two barely visible places for defaults....

Is there any convention for a command line tool to print out the complete command it would synthesize (from environment and config files) if it were executed?

May 07 '20 16:05 kylemacomber

@kylemacomber The closest I could come up with is the --dry-run option in git commit and other commands. Only other thought I had was how some networking tools have a "copy as curl command" option (not entirely relevant, but still nice).

If no direct precedent exists, I think we'd do well to offer one here. @natecook1000's spot-on about the potential for confusion here.

May 07 '20 17:05 mattt

git-config can show the origin and scope of configuration variables.

e.g. git config --show-origin --list
e.g. git config --show-scope --list (available since version 2.26).

May 07 '20 17:05 benrimmington

Warning: CLI newb here...

I really like the name --dry-run for this (i.e. "print out the complete command it would synthesize (from environment and config files) if it were executed"). Of course it might be nice to allow folks to override or augment this behavior if they want to provide a richer git-like output.

May 07 '20 22:05 kylemacomber

Can we reopen this discussion? This issue is still relevant; however, it has been open for over 3 years and has been inactive for almost as long.

I understand there was an ongoing discussion of the desired solution, but it seems the engineering shouldn't be too complicated, especially considering ParsableCommand and AsyncParsableCommand already conform to Decodable. I've tried decoding those with a JSONDecoder, but it seems that the property wrappers don't play well with any top-level decoders other than ParsableArguments.

I was able to make a very rough workaround where I have a JSON config file that gets sent as the list of arguments to the command through the static func parse() factory method, but that is far from ideal, even though it works.
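
The general shape of that kind of workaround, flattening a JSON object into an argument list, can be sketched as follows. This is illustrative Python, not the code I actually used; the `config_to_args` name and the `--key value` mapping conventions are assumptions, and nested objects are deliberately rejected because there is no obvious flag spelling for them.

```python
import json


def config_to_args(config_text):
    """Flatten a JSON object into a list of --key value arguments."""
    args = []
    for key, value in json.loads(config_text).items():
        flag = f"--{key}"
        if value is True:
            args.append(flag)               # boolean flag that is switched on
        elif value is False or value is None:
            continue                        # nothing to pass for this key
        elif isinstance(value, dict):
            raise ValueError(f"nested object not supported: {key}")
        elif isinstance(value, list):
            for item in value:              # repeated option, one pair per item
                args += [flag, str(item)]
        else:
            args += [flag, str(value)]
    return args
```

The resulting list is what then gets handed to the command's parse entry point in place of the real command line.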

Apr 06 '23 15:04 fjcaetano

My $0.02 would be that the params files I initially described model what libraries for other languages do and are reasonably scoped (and don't need any Decoders). They can also be a requirement for tools that have to take a lot of arguments, since you can run into command line length limits; a bunch of the swift tools already have to support this concept.

Something else that's structured feels like something that could also be done, but it should probably be separate from the params files concept, as it starts to serve different/more complex usages.

Apr 06 '23 15:04 thomasvl
