nushell icon indicating copy to clipboard operation
nushell copied to clipboard

`str replace` should use plain string by default

Open kubouch opened this issue 3 years ago • 11 comments

Describe the bug

Currently, str replace uses regex by default which might be surprising, e.g., on Windows in cases like 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'.

How to reproduce

> 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
Error: nu::shell::unsupported_input (link)

  × Unsupported input
   ╭─[entry #4:1:1]
 1 │ 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
   ·                      ─────┬───── ─────┬────
   ·                           │           ╰── input type: value originates from here
   ·                           ╰── Parsing error at position 4: Invalid hex escape
   ╰────

Expected behavior

foo\kubouch

The solution is to use the --string flag but I think it should be the other way: String by default and regex with --regex flag

Screenshots

No response

Configuration

key value
version 0.75.1
branch main
commit_hash f4bd78b86d441cd53fb6ff30d3a6423735a423cc
build_os windows-x86_64
build_target x86_64-pc-windows-msvc
rust_version rustc 1.66.1 (90743e729 2023-01-10)
rust_channel 1.66.1-x86_64-pc-windows-msvc
cargo_version cargo 1.66.1 (ad779e08b 2023-01-10)
pkg_version 0.75.1
build_time 2023-02-11 12:40:58 +02:00
build_rust_channel release
features database, dataframe, default, trash, which, zip
installed_plugins

Additional context

No response

kubouch avatar Feb 11 '23 11:02 kubouch

I wonder if this is related to @bobhy's recent PR https://github.com/nushell/nushell/pull/7883 ?

fdncred avatar Feb 11 '23 13:02 fdncred

I bet it is! Let me see what I can do there.

bobhy avatar Feb 11 '23 21:02 bobhy

My change didn't pay proper attention to the single quotes. Ill have a fix tonight. You should be able to work around by changing string to double quotes and escaping the backsplash, like "C:\Users" in the meantime.

bobhy avatar Feb 11 '23 22:02 bobhy

Both

'C:\Users\kubouch' | str replace `C:\Users` 'foo'

and

'C:\Users\kubouch' | str replace "C:\\Users" 'foo'

lead to the same error.

kubouch avatar Feb 11 '23 23:02 kubouch

But this works:

〉'C:\Users\kubouch' | str replace 'C:\\Users' 'foo'
foo\kubouch

I'm working on it...

bobhy avatar Feb 11 '23 23:02 bobhy

... and failing at it.

I don't think this is my bug after all.

A careful reading of the error as reported above shows it is a ShellError::UnsupportedInput which the parser code never raises. Further, the detail text 'Invalid hex escape' isn't how I phrased it in the errors I do raise.

I can't find the phrase 'Invalid hex escape' or subsets of it anywhere in the sources. I guess it must be constructed in a format expression, but I can't find that either.

In some additional testing, I find that the error only repros on the first positional, not the second, and not for the 'path join' command, which also takes 2 string positionals.

〉 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
Error: nu::shell::unsupported_input (link)

  × Unsupported input
   ╭─[entry #1:1:1]
 1 │  'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
   ·                       ─────┬───── ─────┬────
   ·                            │           ╰── input type: value originates from here
   ·                            ╰── Parsing error at position 4: Invalid hex escape
   ╰────

--------------------------------------------
〉 'C:\Users\kubouch' | str replace 'foo' 'C:\Users'
C:\Users\kubouch
--------------------------------------------
〉 'C:\Users\kubouch' | path join 'C:\Users' 'foo'
C:\Users\kubouch/C:\Users/foo
--------------------------------------------
〉 'C:\Users\kubouch' | path join 'foo' 'C:\Users'
C:\Users\kubouch/foo/C:\Users

I'll continue to poke at it, but would appreciate more help.

bobhy avatar Feb 12 '23 01:02 bobhy

... and just like that, a break in the case. The ::UnsupportedInput is raised in crates/nu-command/src/strings/str_/replace.rs:212 based on an error returned from Regex::new(find_str) at :183. I see that find_str going in has value "C:\Users" as reported in the debugger, so it will indeed look like an invalid hex escape in the regex pattern.

Did we change the default of parameter --string in str replace recently? Currently, it defaults to "use regex".

〉 'C:\Users\kubouch' | str replace --string 'C:\Users' 'foo'
foo\kubouch

Works. I'd to close this bug as "by design", but that seems rude.

bobhy avatar Feb 12 '23 01:02 bobhy

This was hard to debug! We can improve the error to make Nushell nicer.

〉 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
Error: nu::shell::incorrect_value (link)

  × Incorrect value.
   ╭─[entry #1:1:1]
 1 │  'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
   ·                                   ─────┬────
   ·                                        ╰── Regex error: Parsing error at position 4: Invalid hex escape
   ╰────

I am amazed that there was no existing 'invalid argument' to use here. There are bound to be lots of places ::IncorrectValue sould be used instead of ::UnsupportedInput, and the latter should be renamed to ::UnsupportedType to reduce the ambiguity.

Anyway, I'll submit this use of it as a PR for your consideration.

bobhy avatar Feb 12 '23 02:02 bobhy

Ah, I didn't know it's regex by default. IMO it should be plain string instead of regex by default. I'll change the issue description. Thanks for looking into it @bobhy!

kubouch avatar Feb 12 '23 13:02 kubouch

I'm a Windows user too, so I'm accustomed to making many accommodations to its path syntax, but I'd hate to reduce Nu capabilities just to cater to that scenario. Nushell on Windows should support POSIX paths, like Python and Powershell do (but, unlike the latter, should not revert to backslashes when doing command line completion).

Deciding which way the str replace default should go is a hard choice. Either way, you know there will be a substantial minority of users disadvantaged.

Perhaps better would be to define an additional subcommand instead, e.g, str substitute. That way each community gets a clean command to focus on, nobody has to lose.

bobhy avatar Feb 12 '23 15:02 bobhy

Btw, one thought I had for a more general solution was to introduce a way to create path objects. Then when a command receives such an object it can either handle it correctly or error and suggest the correct flag.

JT suggested something like p'C\Users' to make a path object (my initial suggestion was path 'C:\Users').

ChrisDenton avatar Feb 12 '23 19:02 ChrisDenton