
Consider optional case insensitive parsing

Open jonsequitur opened this issue 6 years ago • 37 comments

Developers should have the option to create command lines where commands and options are matched in a case insensitive manner. This matches existing expectations for PowerShell users, and even for POSIX users, this may make it easier for them to do the right thing.

jonsequitur avatar Jun 06 '18 04:06 jonsequitur

I think this is a good idea and we should continue to get feedback on it.

My initial mindset comes from the Windows Command Prompt (where I internalized CLIs years ago), and it is definitely case insensitive.

KathleenDollard avatar Jun 20 '20 15:06 KathleenDollard

As I posted on #969, it's not clear to me that we should do this. Command lines are typically case sensitive on Linux/macOS, and often on Windows as well. Could case insensitivity lead to an inconsistent ecosystem that will confuse people? When I'm using a new command line tool, how will I know whether it's case sensitive or not?

jonsequitur avatar Jul 09 '20 14:07 jonsequitur

How about giving the consumer more flexibility to decide? Their command lines may be cross-platform, or they may be Windows-only. Their existing apps may even be case insensitive, and they would like to move to this great package. We are encouraging case-sensitive command lines, but I'm not sure MSFT will ever change cmd and PowerShell to be case sensitive. If, for whatever reason, they want their command lines to be case insensitive and we don't support it, they will ultimately look for other solutions that do, rather than change their minds.

baochenw avatar Jul 10 '20 13:07 baochenw

If we allow case insensitive matching, that opens up the possibility of an invocation failing due to an ambiguous match. That is, when more than one symbol could potentially match a given token. A few things worth discussing further:

  • How would case insensitivity be toggled? If it is controlled by the developer, this could simply be an API addition. If it were controlled by the CLI consumer, it would likely need to be controlled by something like a directive.
  • How to handle errors caused by case insensitive matching? I suspect this would need to be handled in a similar manner to an existing parse error, with some message indicating that it failed due to matching multiple symbols.
  • How to handle POSIX bundling? Right now we support bundling of arguments; allowing case insensitive matching adds additional complexity here.
  • How to handle symbol matching? Unfortunately this is not always simple with nested commands, since items may be specified in any order. Should it be considered ambiguous if there is a case-insensitive match on the current command, but an exact match on a parent command?

Keboo avatar Aug 27 '20 23:08 Keboo

I would very much like to see the developer given a choice of whether their CLI behaves in a case-sensitive or a case-insensitive manner. I prefer using camelCase or PascalCase for (longer) option names, even when also employing kebab-case, as I find this more readable in the help dump than using snake_case or kebab-case alone. But I do not want to burden the user with painstakingly observing and obeying the letter case.

While I have no deeper insight into where such a choice should live or what a good API for it should look like, I feel that CommandLineConfiguration and CommandLineBuilder (through a Use* extension method) could be good candidates for offering such a choice to a developer employing System.CommandLine.

If we allow case insensitive matching, that opens up the possibility of an invocation failing due to an ambiguous match. That is, when more than one symbol could potentially match a given token.

How and where do you see case-insensitivity introducing additional ambiguity? In a case-insensitive mode, the options -c and -C would be equivalent. Having a case-insensitive CLI that has to parse -c -C or POSIX-bundled -cC would be no different from having a case-sensitive CLI that has to parse -c -c or -cc. System.CommandLine already has to deal with the latter, so dealing with the former should be exactly the same thing, but based on StringComparer.OrdinalIgnoreCase comparisons and lookups. (Or perhaps StringComparer.InvariantCultureIgnoreCase; I don't know which would be the wiser choice if a developer chooses to define option names with non-Latin letters.)
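A minimal sketch of that equivalence, assuming the parser resolves aliases through a dictionary keyed by alias. AliasTable and Build are hypothetical names used only for illustration and are not part of System.CommandLine; only Dictionary and StringComparer are real .NET APIs here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a System.CommandLine type): a lookup from alias to option name.
public static class AliasTable
{
    // Builds the lookup with the supplied comparer.
    // Dictionary.Add throws ArgumentException when two aliases are equal under that comparer,
    // so "-c" and "-C" collide under OrdinalIgnoreCase exactly like "-c" and "-c" do under Ordinal.
    public static Dictionary<string, string> Build(
        IEnumerable<(string Alias, string OptionName)> entries,
        StringComparer comparer)
    {
        var table = new Dictionary<string, string>(comparer);
        foreach (var (alias, optionName) in entries)
        {
            table.Add(alias, optionName);
        }
        return table;
    }
}
```

With StringComparer.OrdinalIgnoreCase, a table built from the single entry ("-c", "config") also resolves "-C", and a table built from both ("-c", ...) and ("-C", ...) fails at definition time with an ArgumentException rather than at parse time.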

  • How would case insensitivity be toggled? If it is controlled by the developer, this could simply be an API addition. If it were controlled by the CLI consumer, it would likely need to be controlled by something like a directive.

In my opinion, the choice and ability to enable case-insensitivity should rest in the hands of the developer only. The developer either designs their CLI to function in a case-insensitive manner, or they don't.

If the CLI is not designed to be case-insensitive, what would be the consequences of the CLI consumer being able to enable case-insensitive parsing through a directive or other mechanism? Doing so would put a burden on the consumer of the CLI to verify that the CLI still operates in the same way and does not exhibit unforeseen and unexpected changes in behavior. That would be a pretty tough ask for the consumer of the CLI.

On the other hand, if the CLI is expressly designed and behaves in case-insensitive manner to begin with, what would be the actual point of a consumer of the CLI being able to enable case-sensitive mode?

  • How to handle errors caused by case insensitive matching? I suspect this would need to be handled in a similar manner to an existing parse error, with some message indicating that it failed due to matching multiple symbols.

Since -c and -C would be equivalent in case-insensitive mode, defining an option for -c and an option for -C in a case-insensitive CLI should be treated exactly like a developer defining the option -c twice (a situation which System.CommandLine already handles).

  • How to handle POSIX bundling? Right now we support bundling of arguments; allowing case insensitive matching adds additional complexity here.

Where is the additional complexity you see, and what is its nature? In case-insensitive mode, a bundle like "-abcCD" would be equivalent to "-abccd" (which System.CommandLine is already able to handle).

  • How to handle symbol matching? Unfortunately this is not always simple with nested commands, since items may be specified in any order. Should it be considered ambiguous if there is a case-insensitive match on the current command, but an exact match on a parent command?

Don't (re)introduce case-sensitive behavior if the developer has expressly chosen case-insensitive behavior. If a parent command and a sub command feature options with names that only differ in their letter case, then treat those options as having the same name when in case-insensitive mode.

elgonzo avatar Feb 07 '21 12:02 elgonzo

I think the previous comments assumed this would be an end user decision and not a developer decision. I think each approach has its own problems.

For example, possible ambiguities arise if the end user can toggle this setting. We would need a mechanism (preferably at design time, to avoid a perf hit on every run) to ensure that a given configuration would be valid in case-insensitive mode.

If this is a developer decision, on the other hand, how will users know whether a given command line is case sensitive? Will it be confusing for macOS / *nix users to encounter case insensitive command lines?

jonsequitur avatar Feb 09 '21 01:02 jonsequitur

I just hit this. I'm trying to modernize an internal application where its previous command-line parser was a NIH homebrew, and my users expect case-insensitivity. This is a show-stopper.

To my mind, each alias should be definable as case-sensitive or not, so that single-character aliases can be case-sensitive in their crowded space, but long verbose commands don't need to be.

Honestly, I'm surprised that even this lib is so standardized on the double-hyphen prefix. I'd have assumed it would have me pick the identifiers and then use whatever prefixing characters are native to the platform.

Pxtl avatar Feb 22 '21 21:02 Pxtl

each alias should be definable as case-sensitive or not, so that single-character aliases can be case-sensitive in their crowded space, but long verbose commands don't need to be.

You can do this currently by specifying the additional aliases:

var option = new Option<string>(new [] { "-a", "-A" });

I'd assume it would have me pick the identifiers and then use whatever prefixing characters are native to the platform.

Most of our examples use POSIX-style prefixes but you can absolutely choose the prefixes you like. The System.CommandLine HelpOption includes both styles:

https://github.com/dotnet/command-line-api/blob/3e11e14238411495eb3460a35056803c31a5c061/src/System.CommandLine/Help/HelpOption.cs#L8-L14

The .NET command line ecosystem has been trending toward broader use of POSIX for several years, not least because .NET Core tools are cross-platform, and on non-Windows systems, the / character conflicts with file path separators. Consistency across operating systems helps reuse of documentation, script code, and so on.

jonsequitur avatar Feb 22 '21 22:02 jonsequitur

Ah, thanks for the clarification. Although by "conventions of the platform", I was thinking of how PowerShell is pretty strongly on single-hyphen, not double-hyphen, for options. That's mostly what I had in mind, not the old slashes.

As for "using multiple cases as aliases", I actually meant the reverse: for single-char aliases you'd want case-sensitivity, but case-insensitivity would be more useful for long options where the user can't remember if it's StartDateTime or startDateTime or StartDatetime or startDatetime or startdatetime or what, and generating every possible combination is non-trivial. And while I've been experimenting with autogenerating a suite of aliases for longer option names, I found they made the "Help" screen unusably cluttered. Is there a way to restrict "help" to only list certain aliases?

Pxtl avatar Feb 22 '21 23:02 Pxtl

From an accessibility standpoint, case sensitivity can be a nightmare for dyslexic/dysgraphic users. Having the option when developing to disable case sensitivity (especially on subcommands) would definitely improve my experience with my own tools. It's less important on single-letter aliases than it is on longer "--this-is-a-thing" params or subcommands. I don't think this would harm too many *nix users either, given that the case-sensitive environment there usually conditions one not to change the casing of flags from the way they learn or discover them in help.

masample-ms avatar Feb 23 '21 00:02 masample-ms

I love this library, been trying it from F#, it's really nice. 🙂

And, obviously, I'm here because I'd also like case-insensitivity.

If this is a developer decision, on the other hand, how will users know whether a given command line is case sensitive? Will it be confusing for macOS / *nix users to encounter case insensitive command lines?

It should definitely be a developer decision. And what I really, really want is some sort of .ApplyPlatformCaseSensitivity = true. On Windows, where I live, I'll get case-insensitivity. On Linux etc. they'll get case-sensitive.

If not, I'm happy to write the if Windows {...case-insensitive...} else {...case-sensitive...} part myself. Just please give me the chance to.
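A minimal sketch of that "if Windows, else" choice, assuming a parser that accepts a StringComparer for matching symbol names (System.CommandLine does not expose such a parameter in this thread). PlatformCase and its methods are hypothetical names; RuntimeInformation.IsOSPlatform and StringComparer are real .NET APIs.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (not a System.CommandLine type): picks a name comparer per platform.
public static class PlatformCase
{
    // Case-insensitive on Windows, case-sensitive everywhere else.
    public static StringComparer ForPlatform(bool isWindows) =>
        isWindows ? StringComparer.OrdinalIgnoreCase : StringComparer.Ordinal;

    // Detects the current OS at run time and returns the matching comparer.
    public static StringComparer ForCurrentPlatform() =>
        ForPlatform(RuntimeInformation.IsOSPlatform(OSPlatform.Windows));
}
```

Under this sketch, PlatformCase.ForPlatform(true) treats "--Name" and "--name" as equal, while PlatformCase.ForPlatform(false) does not.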


Beyond that, I want to challenge an assumption underneath all of this: that case-sensitive is somehow a better choice because operating systems that go back to the 1970's did it that way. I just don't agree. And I'm not alone.

It does a disservice to Windows users, who deserve to have their CLI's work the way Windows does. If CLI's are case-sensitive on Windows, that's a UX fail. It feels lazy. And I'd rather ship something case-insensitive and have most Linux etc. users not notice it, than have every Windows user run into errors within the first minute of using my CLI because of case-sensitivity. It's a rude way to meet a new thing you're trying.

You may disagree, and that's great. But, please, as a craftsman, give me the choice of doing UX the way I think it should be done.

It's not enough to make me stop using System.CommandLine, I really do like it, but it's a gap that I think is important to address.

ScottArbeit avatar Aug 21 '21 01:08 ScottArbeit

Here's my crude way of implementing case-insensitivity as a workaround:

  1. I've made all my option-names all lower-case.
  2. My program opens with
        private static int Main(string[] args)
        {
            ChangeAllOptionNamesToLowerCase(args);


...


        /// <summary>
        /// System.CommandLine is case-sensitive and does not support
        /// case-insensitivity, so //HACK in case-insensitivity by lowercasing
        /// all option-names.
        /// </summary>
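        private static void ChangeAllOptionNamesToLowerCase(string[] args)
        {
            for (var i = 0; i < args.Length; i++)
            {
                // i == 0 is the command-name; anything starting with "-" is treated as an option-name.
                // Note: this also lowercases a value attached to the option, as in --name=Value.
                if (i == 0 || args[i].StartsWith("-", StringComparison.Ordinal))
                {
                    // Invariant lowercasing avoids culture-specific surprises (e.g. the Turkish dotless i).
                    args[i] = args[i].ToLowerInvariant();
                }
            }
        }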

However, this hack only works because my program has a very simple command-name structure - first arg is command-name and all following args are option-names or option-values.

Obviously this is a very crude hack to just mutate all the args, but it works for me.

Pxtl avatar Sep 03 '21 18:09 Pxtl

Should it be an option on the parser/command?

John0King avatar Oct 28 '21 07:10 John0King

I suspect that allowing some commands to be case insensitive while others are case sensitive would make a CLI harder for the end user to understand and learn so I would favor making the entire parser case insensitive when this is used.

jonsequitur avatar Oct 29 '21 17:10 jonsequitur

@jonsequitur I came upon this as I need to replace an implementation of CommandLineUtils that used the original .NET Framework 1.1 release. I have been able to figure most of this, as the 2.0 changes change many things to conventions (like ASP.NET), as opposed to the previous hard defined objects like CommandHandler. I do very much like the way that this has been designed, so Hats off to you!

I didn't really see a discussion here on commands vs options, which is where I see a big difference. With options, even on Windows they are usually case sensitive, so I definitely agree with your thoughts there, especially for single letter and -- (double dash) options. However, commands have typically always been case-insensitive, and from my recollection commands are not commonly used in Linux/Unix environments.

With regard to your statement above, that does make sense. I would suggest that there be 2 properties, CaseSensitiveOptions and CaseSensitiveCommands, applied to the RootCommand object. Allowing this to change per command, or per option, would be beyond confusing to end users.

Kaelum avatar Feb 25 '22 22:02 Kaelum

I just hit this. I'm trying to modernize an internal application where its previous command-line parser was a NIH homebrew, and my users expect case-insensitivity. This is a show-stopper.

I too was testing switching to this library for a legacy app that has a homebrew solution; I need to maintain the case-insensitive behavior of the legacy app. It would be nice to be able to standardize our console apps on the dotnet one; that would require an api option for this to handle such existing app cases.

Adding overloads where we explicitly specify the equality comparer would be acceptable.

RyanThomas73 avatar May 01 '22 03:05 RyanThomas73

Same feedback from me.. we need case insensitivity for backwards compatibility. Are there plans to address this anytime soon?

Is it just this comparison? https://github.com/dotnet/command-line-api/blob/4dcee4d8a84a5aec7bbee0e9e112a2432e19a411/src/System.CommandLine/IdentifierSymbol.cs#L48

cheenamalhotra avatar May 19 '22 20:05 cheenamalhotra

Question for everyone on this thread. Per the suggestion made by @Kaelum above, would case-insensitive commands meet your backward compatibility needs, or are case-insensitive options also needed? If the latter, is it also your expectation that POSIX bundling should still work?

jonsequitur avatar May 19 '22 20:05 jonsequitur

At least for me that would work. I actually got it working in tests. Do you want me to contribute? :)

cheenamalhotra avatar May 19 '22 21:05 cheenamalhotra

@jonsequitur for uses that I can see, only the optional use of case-insensitive commands is needed.

For options, I have never seen, nor expected, that "--" (double dash) options be case-insensitive. Only in cases where there are just a few single-letter options have I seen them be case-insensitive. With the current design, those are easy to code by just specifying both the lower case letter and the upper case letter for the option.

Kaelum avatar May 19 '22 21:05 Kaelum

For options, I have never seen, nor expected, that "--" (double dash) options be case-insensitive.

@Kaelum "--"-prefixed options tend to be found in POSIX-style command line apps, which are typically case-sensitive, so this makes sense. But System.CommandLine doesn't distinguish between different option prefixes, so to be clear, case-insensitive options do exist. PowerShell options (e.g. -Verbose) and MSBuild options (e.g. /p) are case-insensitive. We've avoided case-insensitivity in System.CommandLine because of POSIX conventions, and bundling in particular.

jonsequitur avatar May 19 '22 21:05 jonsequitur

@jonsequitur that was part of the reason that I discussed having the 2 settings, but I also sort of consider PowerShell to be an outlier. Do you intend for PowerShell module creators to use this? Because PowerShell is so different when compared to all other command lines, I am not sure if it is something that should truly be considered. I'm on the fence either way.

If you are considering optional case-insensitive options, maybe have it exclude all options that are prefixed by more than a single "-" (dash)? We do have some applications that use "---" (triple dashes) for development modes, and we've always considered them to be case sensitive.

Kaelum avatar May 19 '22 22:05 Kaelum

Do the POSIX Specs restrict using case-insensitive command line arguments? If not, I think it should be in the interest of the library to make that decision.

cheenamalhotra avatar May 19 '22 22:05 cheenamalhotra

For options, I have never seen, nor expected, that "--" (double dash) options be case-insensitive.

@Kaelum "--"-prefixed options tend to be found in POSIX-style command line apps, which are typically case-sensitive, so this makes sense. But System.CommandLine doesn't distinguish between different option prefixes, so to be clear, case-insensitive options do exist. PowerShell options (e.g. -Verbose) and MSBuild options (e.g. /p) are case-insensitive. We've avoided case-insensitivity in System.CommandLine because of POSIX conventions, and bundling in particular.

For the single-letter "aliases" on options (I'm calling them this because in System.CommandLine it makes sense to implement them with aliases) it seems reasonable if they have to be case sensitive to some degree, precisely because of the bundling and POSIX convention. DIYing case insensitivity for those is not hard; my tooling uses this (which also enables us to handle - or /):

public static Option AddSingleCharAlias(this Option option, char aliasChar)
{
    var upperChar = char.ToUpper(aliasChar);
    var lowerChar = char.ToLower(aliasChar);
    option.AddAlias($"-{upperChar}");
    option.AddAlias($"-{lowerChar}");
    option.AddAlias($"/{upperChar}");
    option.AddAlias($"/{lowerChar}");
    return option;
}

For both the longer -- syntaxed options (we use those as our main option names) and especially for subcommands, it's not easy to DIY case insensitivity. You'd effectively need to alias every case permutation. Having an opt-in way to force them to match case-insensitively would be desirable, and having to toggle it independently for both Options and Subcommands would be fine as well.
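To put a number on "every case permutation", here is a minimal sketch that enumerates them. CaseVariants is a hypothetical helper, not part of System.CommandLine, and it assumes ASCII option names so that every letter has distinct upper and lower forms.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: enumerates every upper/lower casing of an option or command name.
public static class CaseVariants
{
    public static IEnumerable<string> Of(string name)
    {
        // Record where the letters are; prefix characters and hyphens are left untouched.
        var letterPositions = new List<int>();
        for (var i = 0; i < name.Length; i++)
        {
            if (char.IsLetter(name[i])) letterPositions.Add(i);
        }

        // 2^n variants for n letters; refuse names that would overflow the int mask.
        if (letterPositions.Count > 30)
        {
            throw new ArgumentException("Too many letters to enumerate.", nameof(name));
        }

        var total = 1 << letterPositions.Count;
        for (var mask = 0; mask < total; mask++)
        {
            var chars = name.ToCharArray();
            for (var bit = 0; bit < letterPositions.Count; bit++)
            {
                var pos = letterPositions[bit];
                // A set bit means upper case at that letter; a clear bit means lower case.
                chars[pos] = (mask & (1 << bit)) != 0
                    ? char.ToUpperInvariant(chars[pos])
                    : char.ToLowerInvariant(chars[pos]);
            }
            yield return new string(chars);
        }
    }
}
```

A name with n letters yields 2^n variants, so "--ab" produces 4 aliases while a 13-letter name such as "--start-date-time" would need 8192, which is why aliasing is only practical for single-character options.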

masample-ms avatar May 19 '22 22:05 masample-ms

would case-insensitive commands meet your backward compatibility needs, or are case-insensitive options also needed? If the latter, is it also your expectation that POSIX bundling should still work?

I don't think of this in terms of backward compatibility. I think of it in terms of kindness, in terms of being kind to me and my users on Windows who expect case-insensitive parsing of commands and options (and file names). So, yes, I absolutely want a way to deliver case-insensitive options.

I'm happy to forego POSIX bundling in order to get it. I'm also happy to have overloads that allow me to specify the equality comparer, and/or to use #ifdef's or pattern matching to customize my CLI for Windows vs. other platforms.

For both the longer -- syntaxed options (we use those as our main option names) and especially for subcommands its not easy to DIY case insensitivity.

This. I know I can easily create single-dash-single-letter aliases in both cases, but I want case-insensitivity for -- options also.

@jonsequitur I guess the question I really have is: maybe I'm missing something, but what is your resistance to adding a way to do case-insensitive commands and options here? By all means, keep System.CommandLine case-sensitive by default, I'm not arguing with that. Roughly 50% of my expected users will be on Windows and I just want to deliver a platform-consistent experience to them.

ScottArbeit avatar May 19 '22 22:05 ScottArbeit

@ScottArbeit I like your optional StringComparer parameter in the constructor, but I also see @jonsequitur's point. This would significantly complicate the detection and resolution of overlap/duplication issues, which I think is the point here. Going back to my initial proposal with 1 flag for case-insensitive commands, and 1 flag for case-insensitive options, that would address your concerns and it would be much easier to detect overlap/duplication issues.

Kaelum avatar May 19 '22 23:05 Kaelum

Question for everyone on this thread. Per the suggestion made by @Kaelum above, would case-insensitive commands meet your backward compatibility needs, or are case-insensitive options also needed? If the latter, is it also your expectation that POSIX bundling should still work?

Case-insensitive options are also needed. For example, we have some existing apps that parse /optionname=<value> and/or /optionname <value>, ignoring the option name case. No, I wouldn't expect or need POSIX-style bundling to work while ignoring case in these backwards compatibility scenarios.

@cheenamalhotra

Is it just this comparison?

At a minimum, I believe it would also require updates at these locations:

Constructor overload to specify the equality comparison used here: https://github.com/dotnet/command-line-api/blob/4dcee4d8a84a5aec7bbee0e9e112a2432e19a411/src/System.CommandLine/IdentifierSymbol.cs#L14

And here: https://github.com/dotnet/command-line-api/blob/4dcee4d8a84a5aec7bbee0e9e112a2432e19a411/src/System.CommandLine/Option.cs#L136

As well as a way to customize the comparer used in this extension method. Maybe a property off the Command that the extension method is targeting: https://github.com/dotnet/command-line-api/blob/4dcee4d8a84a5aec7bbee0e9e112a2432e19a411/src/System.CommandLine/Parsing/StringExtensions.cs#L442

RyanThomas73 avatar May 19 '22 23:05 RyanThomas73

Do you intend for PowerShell module creators to use this? Because PowerShell is so different when compared to all other command lines, I am not sure if it is something that should truly be considered. I'm on the fence either way.

It's not a goal, but it's sometimes brought up as contributing to command line users' confusion that some things are case sensitive and other things are not.

Do the POSIX Specs restrict using case-insensitive command line arguments? If not, I think it should be in the interest of the library to make that decision.

Most are case sensitive by convention at least, maybe because of the *nix origins. For example, git clean -x and git clean -X do different things.

jonsequitur avatar May 20 '22 00:05 jonsequitur

I guess the question I really have is: maybe I'm missing something, but what is your resistance to adding a way to do case-insensitive commands and options here? By all means, keep System.CommandLine case-sensitive by default, I'm not arguing with that. Roughly 50% of my expected users will be on Windows and I just want to deliver a platform-consistent experience to them.

I understand that you're not pushing for a change to the default behavior (though I think it might be unobtrusive to do so for commands.)

Options are more complicated because case insensitivity might make certain things harder for end users to understand. In short, I think it comes down to a choice that the developer makes, and the end user then needs to be able to understand from one tool to another, between either

  • case-insensitive options or
  • support for POSIX bundling.

When we give developers this choice (which is objectively a good thing), it's also our responsibility to help end users understand it.

For context, a design principle we've tried to follow is to avoid fragmenting the end user experience. This presents a couple of tensions:

  • Consistency of your app across operating systems vs. consistency with the conventions of the operating system. (We've favored the former, POSIX conventions, and case sensitivity.)

  • Consistency across apps for things like parsing behaviors, help conventions, and CLI grammar vs. customization of all of those things. (We've chosen what we think are sensible defaults and provided many places to customize, often driven by the goal of enabling people to move older apps onto System.CommandLine without forcing breaking changes.)

jonsequitur avatar May 20 '22 02:05 jonsequitur

@jonsequitur Thank you for the context. I really do appreciate how seriously the team has thought about it, and the desire to maintain consistency.

I can imagine a nuisance if, say, one team member / open-source contributor on Windows writes a script using my tool, expecting it to be case-insensitive, and then another team member / contributor runs that script on Linux and it fails because an option has the wrong casing. Not ideal, but also not a major blocker.

The one assumption you listed that I'll challenge is

it's also our responsibility to help end users understand it.

I hadn't considered that, and I love that you've all thought about it, and that you want to help with it. Just so you and the rest of the team know: I feel like it's my responsibility to my end users, not yours. I'm willing to take the risk of having my users blame me if they don't like my UX. I don't expect they'll blame System.CommandLine, if they're even aware that it's part of the solution.

ScottArbeit avatar May 20 '22 02:05 ScottArbeit