clap icon indicating copy to clipboard operation
clap copied to clipboard

Short options with arguments conflicts with `allow_hyphen_values`

Open 197g opened this issue 3 years ago • 4 comments

Please complete the following tasks

Clap Version

3.0

Describe your use case

The program GNU seq requires us to parse floating point values as arguments, which may be negative or have formatting different from the standard <f64 as FromStr> style. At the same time it accepts some short options to control the output. The problematic case is combining those two: we must utilize AllowHyphenValues while also accepting short option values.

$ seq -s, -1 2
-1,0,1,2
$ # Demonstrating non-f64-style arguments that must also match positional values:
$ seq -0x.ep-3 -0x.1p-3 -0x.fp-3
-0,109375
-0,117188

This is solved in GNU by external iteration over the arguments where seq itself decides what constitutes a flag, and what starts the value arguments: https://github.com/coreutils/coreutils/blob/master/src/seq.c#L593-L604

In clap, however, when AllowHyphenValues is active then only short options character are allowed to appear in an option. Otherwise, it is interpreted as a position argument. See https://github.com/clap-rs/clap/blob/5c3868ea4cb8063731d8526e8e97414942a987ae/src/parse/parser.rs#L994-L995

Describe the solution you'd like

Now, I would propose to add a new setting that slightly modifies these rules. Where an hyphenated argument may start with short options but where short option parsing stops at the first value-taking option, and any other characters are permitted when such an option is recognized. That is, in this context, where the short options w and s exist, -ws, would be interpreted as:

  1. w matches a short option without argument, continue.
  2. s matches a short option that does take an argument, switching to value mode because this option takes a value.
  3. - is ignored as part of the value.
  4. The argument -ws, is interpreted as options.

Alternatives, if applicable

A more intricate solution would be to enable external iteration, where arguments are consumed by clap one-by-one. That is, more like the GNU style parsing loop where some get_first_argument(iterator) consumes only some arguments from the iterator but returns control to the caller after the first option has been consumed. This could enable an iteration loop in the style of GNU seq where the arguments are pre-tested by the program logic on whether they constitute a positional argument; allowing us to break manually instead of require AllowHyphenValues to control clap to do this internally.

Additional Context

197g avatar Feb 06 '22 17:02 197g

Sorry, I forgot to fill out additional context: It's mentioned in uutils/coreutils: https://github.com/uutils/coreutils/pull/3081

197g avatar Feb 06 '22 17:02 197g

Now, I would propose to add a new setting that slightly modifies these rules. Where an hyphenated argument may start with short options but where short option parsing stops at the first value-taking option, and any other characters are permitted when such an option is recognized

I think this makes sense

A more intricate solution would be to enable external iteration, where arguments are consumed by clap one-by-one. That is, more like the GNU style parsing loop where some get_first_argument(iterator) consumes only some arguments from the iterator but returns control to the caller after the first option has been consumed. This could enable an iteration loop in the style of GNU seq where the arguments are pre-tested by the program logic on whether they constitute a positional argument; allowing us to break manually instead of require AllowHyphenValues to control clap to do this internally.

This sounds more like lexopt. We have plans to modularize clap so we provide a lexopt-like crate and allow you to use all of the other parts of clap built up together.

epage avatar Feb 07 '22 15:02 epage

If I'm reading this correct, this is the same issue as #6018. A more modern reproduction case is

#!/usr/bin/env nargo
---
[dependencies]
clap = { version = "4", features = ["debug"] }
---

use std::ffi::OsStr;

use clap::Command;
use clap::arg;


fn main() {
    let mut cmd = Command::new("build")
        .arg(arg!(-P --parallel <N>))
        .arg(arg!(<ARGS>...).trailing_var_arg(true).allow_hyphen_values(true));
    cmd.build();
    dbg!(&cmd);

    let matches = if false {
        cmd.get_matches_from(vec!["build", "-P", "32", "foo", "--bar"])
    } else {
        cmd.get_matches_from(vec!["build", "-P32", "foo", "--bar"])
    };

    dbg!(&matches);

    assert_eq!(matches.get_one("parallel"), Some(&String::from("32")));
    let mut args = matches.get_raw("ARGS").unwrap().into_iter();
    assert_eq!(args.next(), Some(OsStr::new("foo")));
    assert_eq!(args.next(), Some(OsStr::new("--bar")));
}

(if true will succeed, if false will panic).

epage avatar May 27 '25 17:05 epage

What I'm wondering is why we didn't consider just changing the behavior so that a short that takes a value is always considered a short, rather than a MaybeHyphenValue.

epage avatar May 27 '25 17:05 epage