defopt
defopt copied to clipboard
Support command composition
Specific feature request:
I wonder if you could support composition without adding too much complexity. A lot of my CLIs take an optional config argument which takes a path to a YAML file (defaults to ./config.yaml) and loads that YAML file and initializes a logger. I'd love to be able to re-use that code and simply compose it with other scripts. I don't know if that would be possible without sacrificing simplicity though.
This is within the realm of possibility but I'm looking for a clean design. Suggestions are welcome.
Possible design:
from common import config
def main(...):
...
defopt.run(main, setup=config)
When using a single command as above, arguments from config
would be mixed with arguments for main
(this will fail if they share a name). When using multiple commands, arguments for config
go before the subcommand.
python script.py --config config.yaml subcommand args
config
would called before the chosen command.
Hi, I'm the guy you quoted on reddit for this enhancement suggestion. I just now got around to trying to implement something like this using defopt, but I've run into a few problems. Here's my approach:
I wanted a class that would manage layers of config, provide logging, and provide a CLI (I might need to work on Single Responsibility Principal, but we'll ignore that for now):
import script
# subclass script.Script to get all the nifty tools
class MyScript(script.Script):
def work(self, pos_arg, *args, kw_only_arg, opt_arg=1, **kwargs):
"""doc string"""
super().work(*args, **kwargs)
# logging is built in
self.log.info("starting work with: {}".format(locals())
# script.Script is a subclass of collections.ChainMap and all config is stored in self
host = self.get("host", "localhost")
# config is layered and comes from several places including an optional --config-path cli argument
...do stuff
if __name__ == "__main__":
MyScript().run()
Here's how I'm going about it so far:
import collections
import logging
from logging import config as logging_config
import copy
import defopt
from . import util # this is a place holder for various helper functions
class Script(collections.ChainMap):
"""
Base class for configured, logged, command-line scripts.
Config layers are prioritized by how close to run-time they are set, e.g. config parameters passed in
through the command line take priority over config parameters specified in a file.
Attributes:
cls_config (Dict[str, Any]): default config specified as a class-level attribute (lowest priority)
obj_config (Dict[str, Any]): config passed in to the __init__ method (low priority)
pth_config (Dict[str, Any]): config loaded from a yaml file (high priority)
cli_config (Dict[str, Any]): config key-value pairs from the command line (highest priority)
log (logging.Logger): a logger
default_config_path (str): fallback path to look for a yaml config file
reserved (Set[str]): command line parameter names that are reserved
"""
cls_config = {}
default_config_path = "./config.yaml"
reserved = frozenset(["-c", "--config-path", "-h", "--help"])
def __init__(self, **obj_config):
super().__init__()
self.cli_config = {}
self.maps.append(self.cli_config)
self.pth_config = {}
self.maps.append(self.pth_config)
self.obj_config = obj_config
self.maps.append(self.obj_config)
self.maps.append(self.cls_config)
self.log = logging.getLogger(type(self).__name__)
self.config_log("basic logging enabled")
def config_log(self, message=""):
"""configure or reconfigure the logger, then log the given message (if any)"""
config = self.get("logging")
if config:
logging_config.dictConfig(config)
self.log = logging.getLogger(type(self).__name__)
if message:
self.log.info(message)
def run(self, argv=None):
self.log.info("running")
argv = sys.argv[1:] if argv is None else argv
self._load_path_config(argv)
# walks the MRO to build a function with a complete signature and doc-string for a method
work = util.flatten_method(self.work)
# automatically generates short aliases for parameters (excluding any reserved names)
short = util.auto_short(work, reserved=self.reserved)
# removes unknown args from argv and attempts to build a kwargs dict out of them
cli_config = util.filter_kwargs(argv, work, short)
self.cli_config.update(cli_config)
self.config_log("all config loaded and logger fully configured")
try:
defopt.run(work, argv=argv, short=short)
except KeyboardInterrupt:
self.log.info("KeyboardInterrupt received. Exiting")
except:
self.log.exception()
raise
def work(self):
"""Override this method with custom script logic"""
self.log.info("working")
def _load_path_config(self, argv):
self.log.info("loading path config")
config_path = self.default_config_path
if {"-c", "--config-path"} & set(argv):
i = argv.index("-c") if "-c" in argv else argv.index("--config-path")
argv.pop(i) # remove the key
config_path = argv.pop(i) # remove the value
self.log.info("attempting to load config from %s", config_path)
try:
with open(config_path) as f:
pth_config = yaml.load(f)
self.pth_config.update(pth_config)
except FileNotFoundError:
self.log.warinig("config file not found at %s", config_path)
else:
self.config_log("config file loaded")
OK, so hopefully most of that makes at least a little sense. The magic/troublesome bits are the util
helper functions: flatten_method
, auto_short
, and filter_kwargs
. flatten_method
is kinda complicated and might not be a worth-while addition (I haven't quite gotten it to work yet), I might cover it in a different proposal. auto_short
is pretty simple:
import inspect
def auto_short(func, reserved=frozenset()):
"""
Build dictionary of shortened arguments for the given function
Args:
func (callable): the function from which the param list will be derived
reserved (Set[str]): a set of reserved words
Returns (Dict[str, str]): maps original param names to short versions
"""
lowers = set("-"+c for c in string.ascii_lowercase) - reserved
uppers = set("-"+c for c in string.ascii_uppercase) - reserved
short = {}
unshortened = []
# pass one: try to map as many params to the proper case
for name in inspect.signature(func).parameters:
c = "-" + name[0]
if c in lowers | uppers:
short["--" + name.replace("_", "-")] = c
lowers.discard(c)
uppers.discard(c)
else:
unshortened.append(name)
# pass two: try matching swapped cases for unshortened params
for name in unshortened:
c = "-" + name[0].swapcase()
if c in lowers | uppers:
short["--" + name.replace("_", "-")] = c
lowers.discard(c)
uppers.discard(c)
return short
filter_kwargs
is also pretty simple, but it relies on some of the internal workings of defopt, so it makes a better candidate for an enhancement to defopt rather than a stand-alone tool:
import defopt
def filter_kwargs(argv, func, short=None):
"""
pre-filter the given argument vector such that only valid arguments remain
return the invalid arguments as a dictionary of pairs.
Args:
argv (List[str]):
func (callable):
short (Dict[str, str]):
Returns (Dict[str, str]):
"""
short = short or {}
parser = defopt.ArgumentParser(formatter_class=defopt._Formatter)
parser.set_defaults()
defopt._populate_parser(func, parser, None, short)
_, unkonwns = parser.parse_known_args(argv)
if len(unkonwns) % 2:
raise ValueError(
"odd number of remaining args preventing proper key-value pairing: %s", (unknowns,))
kwargs = {k.lstrip("-").replace("-", "_"): v
for k, v in zip(unkonwns[::2], unkonwns[1::2])}
for arg in unkonwns:
argv.remove(arg)
return kwargs
In order to attempt casting kwargs to reasonable types, I also use a helper function: yamlize
. It converts the Dict[str, str] to a yaml-loadable representation so that the yaml parser can guess the types:
import yaml
def yamlize(d):
"""
Treat a dict mapping strings to strings as though it was defined in a yaml file so that the
values can be converted to objects according to the yaml parser.
Args:
d (Dict[str, str]): a dict of mapping keys to serialized values
Returns (Dic[str, Any]): the yaml-interpreted version of d
Example:
>>> d = {"a": "123", "b": "hello"}
>>> yd = yamlize(d)
>>> assert isinstance(yd['a'], int)
>>> assert isinstance(yd['b'], str)
"""
s = "\n".join("%s: %s" % i for i in d.items())
return yaml.load(s)
Finally, I noticed that keyword-only arguments that don't have a default value are treated as positional arguments. I think that's a bug.
Anyway, I hope that all makes some sense, clears up my use case, and maybe gives you some ideas about a decent way to handle kwargs. This is a cool project. It's a joy to use. Thanks for your awesome tool!
A similar approach to your proposed design could be:
import script
widget = script.Script() # can be used for logging and config management, but separate from main
def main(a, *b, c, d=1, **e):
"""..."""
widget.log.info("starting main")
host = widget.get("host", "localhost")
...do stuff
if __name__ == "__main__":
# widget has a configure method which will add params to the parser (e.g. '--config-path')
# it'll throw a ParameterConflict exception if the params conflict with other mixins or functions
defopt.run(main, mixins=[widget.configure])
Thanks for the detailed comments! I've just skimmed them for the moment but I'll come back and have a proper look in the next few days.
I've opened #13 for keyword-only arguments without defaults. I had no idea you could even do that, so it's a pure accident that it works the way it does currently.
The reason I initially didn't want to automatically pick short flags is that adding a parameter to any of the called functions could then unintentionally steal a short flag from another argument. I like the idea of auto_short
making this sort of behavior explicit. I'll think about how best to integrate this or some equivalent.
I've thought about this a lot more. Most of the little things I suggested like auto_short
are bad ideas. They might be useful as utilities or something, but in general they don't belong in the core implementation for the reasons you've stated.
I've also realized that argparse
has pretty much exactly what I want. You can compose parsers by adding parents, add sub-parsers (though, the interface for this is a little clunky), and even allow abbreviations (if **kwargs support gets implemented it should probably raise an error if your parser allows abbreviations and kwargs at the same time). I would keep the defopt.run
function pretty much the way it is, but change the implementation to use a subclass of argparse.ArgumentParser
under the hood and allow access to those parsers.
It could look like so:
class Parser(ArgumentParser):
def __init__(self, *args, **kwargs):
# I would rename 'parsers' to 'deserializers' or something to avoid confusion
self.deserializers = kwargs.pop("parsers", []) or kwargs.pop("deserializers", [])
# I would rename 'short' to 'aliases' and allow it to map a name to an iterable of stings
self.aliases = kwargs.pop("short", {}) or kwargs.pop("aliases", {})
self.func = kwargs.pop("func", None)
kwargs.setdefault("formatter_class", _Formatter)
super(Parser, self).__init__(*args, **kwargs)
# handle to subparsers object if one is created
self._subs = None
if self.func is not None:
self.populate(self.func)
def populate(self, func): # replaces _populate_parser
_populate_parser(func, self, self.deserializers, self.aliases)
def add_argument(self, *args, **kwargs): # replaces _add_argument
# it may be a good idea to allow a deserializer here to override the default deserializer
name = kwargs.pop("name")
positional = kwargs.pop("positional", False)
_add_argument(self, name=name, short=self.aliases, _positional=positional)
def add_subparser(self, func):
if self._subparsers is None:
self._subs = self.add_subparsers(title="subcommands")
if isinstance(func, ArgumentParser):
raise NotImplementedError("there should be a way to add a parser object to subparsers")
else:
subparser = self._subs.add_parser(
func.__name__,
formatter_class=self.formatter_class,
deserializers=self.deserializers,
aliases=self.aliases,
func=func)
subparser.set_defaults(_func=func)
def parse_ak(self, argv=sys.argv[1:]):
"return (args, kwargs) instead of a namespace?"
arg_namespace, unknowns = self.parse_known_args(argv)
# s
if not all(a[0] in self.prefix_chars for a in unknowns):
raise Exception()
if len(unknowns) % 2:
raise Exception()
kwargs = dict(zip(unknowns[::2], unknowns[1::2]))
# move all non-positional arguments from arg_namespace to kwargs, the rest go to args
raise NotImplementedError()
# deserialize arguments
raise NotImplementedError()
return args, kwargs
def run(self, argv=sys.argv[1:]):
args, kwargs = self.parse_ak(argv)
# run any parent parsers
raise NotImplementedError("don't know how exactly to do this")
# run self.func if it exists else kwargs["_func"]
func = self.func or kwargs.pop("_func", None)
if func is None:
raise Exception("no function to run")
return func(*args, **kwargs)
That's all I've got so far. I'm not 100% how parents works with an ArgumentParser
, it looks like they're added as container_actions
(not sure what that means).
I can try to implement this in a feature branch and so that it's a little more fully baked.
Do you have some examples in mind for how this would be used and what the resulting command line would look like?
I think the main problem I have with zipping up the extra arguments into **kwargs
is that it's difficult to emulate argument parsing in exactly the same way argparse does it. For example, does --arg1 --arg2
become {'arg1': '--arg2'}
or does it raise an error? I can imagine that getting fairly fiddly and I'm not yet convinced much is gained from it.
Having said that, I do like the idea of having a class to allow for more tweaking of the internals that you're after. Then it might be relatively simple to add **kwargs
handling for your own project.
If you do end up working on this in a branch, keep me updated.
Sorry it took me so long to get back to you. My thought is that you don't have to make a decision about how specifically to handle *_kwargs now or really ever. It would be nice to have a way to add a *_kwargs handler to a parser. The interface for a **kwargs handler is pretty simple: take a list of strings and return a keyword arguments dictionary. Argument parser has a function to produce that list of strings: arg_namespace, unknowns = self.parse_known_args(argv)
If there is a kwargs_handler and the function has kwargs argument, pass unknowns
into the handler and unpack the result into the function call. Otherwise raise an error.
I will for sure keep you updated on any work I do on defopt.
I'll work on a branch that implements some of the things i'm talking about so it's more clear
Adding an unknown_args_handler as suggested in https://github.com/anntzer/defopt/issues/7#issuecomment-233028244 seems reasonable, e.g. along the lines of
--- i/lib/defopt.py
+++ w/lib/defopt.py
@@ -154,7 +154,11 @@ def run(funcs: Union[Callable, List[Callable]], *,
show_defaults=show_defaults, show_types=show_types, version=version,
argparse_kwargs=argparse_kwargs)
with _colorama_text():
- args = parser.parse_args(argv)
+ if unknown_args_handler:
+ args, rest = parser.parse_known_args(argv)
+ args = unknown_args_handler(args, rest)
+ else:
+ args = parser.parse_args(argv)
# Workaround for http://bugs.python.org/issue9253#msg186387
if not hasattr(args, '_func'):
parser.error('too few arguments')
although this likely needs kwargs support (#3, or rather at least allow them to be ignored on the CLI side) to be really useful (so that the handler can e.g. add more kwargs).
I'll leave the implementation to anyone who has a real use for this, though.
I'm going to consider that this is effectively provided by defopt.bind_known.