argparse Add some form of iterable access to `ArgumentParser` subparsers

This issue is being reported from the discussion had on #212.

The desired behavior is to have some form of access to the subparsers of ArgumentParser. This would allow the library consumer more fine-tuned control over desired behavior after the parsing logic has completed.

For example, a consumer may want to require the user to provide a subcommand to trigger any meaningful business logic. Access to the ArgumentParser subparsers would allow simpler logical checks:

// setup "program" as top parser, add several subcommands,
// then call program.parse_args(...)

if (!program || std::none_of(program.subparsers.begin(), program.subparsers.end(),
    [](const auto& parser){ return static_cast<bool>(*parser); })
{
    std::cout << program;
    std::exit(1);
}

NOTE: the above leverages ArgumentParser::operator bool discussed in #212

Some discussion would need to be had about exactly how to provide access to the subparsers (e.g., maybe only const access through a getter rather than direct member access, or perhaps only through cbegin() / cend() iterators).

Oct 07 '22 18:10 bwpge

Rather than providing a sequence of all subparsers, I believe it makes more sense to have named, direct access; that is, reusing the ArgumentParser::get interface.

argparser::ArgumentParser program;

argparser::ArgumentParser add_command("add");
add_command.add_argument("file");

program.add_subparser(add_command);

program.parse_args(argc, argv);

auto file = program.get<argparser::ArgumentParser>("add").get("file");

Your thoughts?

Oct 11 '22 20:10 skrobinson

I think the above suggestion definitely works for its purpose of exposing parsed details of subparsers, however it doesn't really solve the friction point of abstractly checking if any subparser was used (meaning, it would still require checking a hard-coded list of subparser names to get that information). I think exposing a sequence of subparsers allows for implementing more extensible behavior by the library consumer for custom business logic.

However, thinking about it, this might be better suited for inheritance (e.g., the library consumer could simply implement their own CustomArgumentParser and just work with that instead). This would require making the ArgumentParser members protected instead of private though, not sure how much that affects the design philosophy of the library.

Oct 12 '22 01:10 bwpge

I think exposing a sequence of subparsers allows for implementing more extensible behavior by the library consumer for custom business logic.

I originally thought the same and tried out returning an iterator over the subparsers, but found this prompted more syntax for the consumer without added benefit. But, this may be a failure of imagination on my part.

Can you give a scenario where having a sequence of subparsers is meaningful? Can the sequence elements be used without knowing exactly which subparser is acted on?

// assume subparsers() returns only used subparsers
for ( auto &parser : program.subparsers() ) {
    if ( parser.name() == "add" ) { /* add files */ }
    if ( parser.name() == "commit" ) { /* commit files */ }
}

versus

if ( auto parser = program.get<ArgumentParser>("add"); parser) { /* add files */ }
if ( auto parser = program.get<ArgumentParser>("commit"); parser) { /* commit files */ }

If a consumer needs to identify the subparser before using it, I don't see how a sequence of subparsers helps.

Oct 12 '22 15:10 skrobinson

Perhaps we could provide an interface to register a visitor function for subcommands:

// setup "program" as top parser, add several subcommands,

bool any_subparser_used{false};

program.on_subparser([&](const ArgumentParser& parser) {
  any_subparser_used = true;

  const auto command = parser.name();

  if(command == "add") { 
    handle_add_command(parser);
  }
  else if (command == "commit") { 
    handle_commit_command(parser);
  }
});

program.parse_args(/* ... */);

if (!program || !any_subparser_used) {
  std::cout << "Error: expected subcommand\n";
  std::cout << program;
  std::exit(1);
}

This provides:

the answer to "Has any subparser been invoked?" and "What command did the user use?"
an interface (based entirely on the command name) to route the business logic to handler functions for each subcommand at each parser level

Visitors could be registered for every parser and subparser and so the behavior should propagate down to the lowest level, e.g.,

foo:bar$ ./program submodule update

might first call the visitor for command == "update" (inner-most subparser) and then the visitor for command == "submodule" (the only subparser to the top-level program).

I'm not sure what I am missing here with this.. Thoughts?

Oct 12 '22 17:10 p-ranav

@skrobinson for this part:

If a consumer needs to identify the subparser before using it, I don't see how a sequence of subparsers helps.

Totally agreed at some point the consumer needs to identify the subparser. However, there could be some further advanced subcommand validation logic they want to implement before actually "processing" or "handling" the command. I think I've beaten the "has a subcommand been used?" example to death so I won't belabor it :)

@p-ranav I think your suggestion nails it. As @skrobinson mentioned above, the consumer eventually needs to identify the subparser, so assigning a handler based on the name is totally reasonable. This visitor pattern allows processing whatever extra logic the consumer requires, and also allows tweaking that for different (sub)parsers.

Oct 12 '22 19:10 bwpge

@bwpge

However, there could be some further advanced subcommand validation logic they want to implement before actually "processing" or "handling" the command.

I have a hard time imagining how one would run validation on an unidentified object.

@p-ranav In this example, isn't the action too far from the subject? program.on_subparsers seems to run the subparsers' internals.

What about a per (sub)parser handler?

ArgumentParser add_command("add");

add_command.on_parse([&]() { });  // or add_command.action

This seems more closely aligned to the relationship Argument has with ArgumentParser.

Oct 13 '22 19:10 skrobinson

I have a hard time imagining how one would run validation on an unidentified object.

Just for a personal example to maybe make this more concrete:

My team (working on various platform integrations) is building several libraries that wrap web API calls for quite a few tools my wider group utilizes. Each of these "tools" have very complicated APIs with lots of boilerplate and different ways of making their requests (such as with XML, JSON, weird handshake processes, etc.) and many different ways of responding with results, errors, etc. The libraries will abstract these API calls and processes to be simpler and more unified (e.g., foolib1::get_report(...), foolib2::get_report(...), etc.).

One idea I had to make building different utilities for these libraries much simpler was to have each library implement their own ArgumentParser -- meaning, they would provide their own library subcommand. This would maybe be adjacent to providing a main module __main__.py in Python (not quite the same thing). This would allow other teams to reuse parsers as part of their own projects, and they could get a consistent command line interface if they wanted to expose that for their particular project. This would also make building the utilities themselves fairly trivial (e.g., program.add_subparser(foolib::main_parser)).

For a simpler "handler" interface, I thought to have the libraries implement a bool handle_parser(const ArgumentParser&) function or something similar. The function would do nothing and return false if it doesn't recognize the parser or handle the parser and return true if it does. That's why I mentioned the previous suggestion about a visitor pattern would solve my issue, since I could just implement this logic in the lambda for dispatching.

Requiring each library to provide a string of their command name would certainly still be doable for the above, but it's not necessary for my use case to check command names (other than as a constraint of the library). Not the end of the world of course.

I'm happy to take critiques on the above use case, I realize it might not be the best solution...

Regarding add_command.on_parse(...):

This would definitely solve my problem, and actually might be even better since it would eliminate any need to check parser names. I think this would also simplify subparsers for general library consumers who don't have psychotic use cases like myself... :)

Oct 13 '22 21:10 bwpge

Thank you for taking the time to describe your use case. It is nowhere near my own uses of argparse, but with your explanation, I have a different perspective that I'll keep in mind while working on argparse.

I'm thinking to turn your description into a test case to use while working on .on_parse.

Oct 14 '22 20:10 skrobinson

argparse argparse copied to clipboard

Add some form of iterable access to `ArgumentParser` subparsers

argparse
argparse copied to clipboard