`str replace` should use plain string by default
Describe the bug
Currently, str replace uses regex by default which might be surprising, e.g., on Windows in cases like 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'.
How to reproduce
> 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
Error: nu::shell::unsupported_input (link)
× Unsupported input
╭─[entry #4:1:1]
1 │ 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
· ─────┬───── ─────┬────
· │ ╰── input type: value originates from here
· ╰── Parsing error at position 4: Invalid hex escape
╰────
Expected behavior
foo\kubouch
The solution is to use the --string flag but I think it should be the other way: String by default and regex with --regex flag
Screenshots
No response
Configuration
| key | value |
|---|---|
| version | 0.75.1 |
| branch | main |
| commit_hash | f4bd78b86d441cd53fb6ff30d3a6423735a423cc |
| build_os | windows-x86_64 |
| build_target | x86_64-pc-windows-msvc |
| rust_version | rustc 1.66.1 (90743e729 2023-01-10) |
| rust_channel | 1.66.1-x86_64-pc-windows-msvc |
| cargo_version | cargo 1.66.1 (ad779e08b 2023-01-10) |
| pkg_version | 0.75.1 |
| build_time | 2023-02-11 12:40:58 +02:00 |
| build_rust_channel | release |
| features | database, dataframe, default, trash, which, zip |
| installed_plugins |
Additional context
No response
I wonder if this is related to @bobhy's recent PR https://github.com/nushell/nushell/pull/7883 ?
I bet it is! Let me see what I can do there.
My change didn't pay proper attention to the single quotes. Ill have a fix tonight. You should be able to work around by changing string to double quotes and escaping the backsplash, like "C:\Users" in the meantime.
Both
'C:\Users\kubouch' | str replace `C:\Users` 'foo'
and
'C:\Users\kubouch' | str replace "C:\\Users" 'foo'
lead to the same error.
But this works:
〉'C:\Users\kubouch' | str replace 'C:\\Users' 'foo'
foo\kubouch
I'm working on it...
... and failing at it.
I don't think this is my bug after all.
A careful reading of the error as reported above shows it is a ShellError::UnsupportedInput which the parser code never raises. Further, the detail text 'Invalid hex escape' isn't how I phrased it in the errors I do raise.
I can't find the phrase 'Invalid hex escape' or subsets of it anywhere in the sources. I guess it must be constructed in a format expression, but I can't find that either.
In some additional testing, I find that the error only repros on the first positional, not the second, and not for the 'path join' command, which also takes 2 string positionals.
〉 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
Error: nu::shell::unsupported_input (link)
× Unsupported input
╭─[entry #1:1:1]
1 │ 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
· ─────┬───── ─────┬────
· │ ╰── input type: value originates from here
· ╰── Parsing error at position 4: Invalid hex escape
╰────
--------------------------------------------
〉 'C:\Users\kubouch' | str replace 'foo' 'C:\Users'
C:\Users\kubouch
--------------------------------------------
〉 'C:\Users\kubouch' | path join 'C:\Users' 'foo'
C:\Users\kubouch/C:\Users/foo
--------------------------------------------
〉 'C:\Users\kubouch' | path join 'foo' 'C:\Users'
C:\Users\kubouch/foo/C:\Users
I'll continue to poke at it, but would appreciate more help.
... and just like that, a break in the case.
The ::UnsupportedInput is raised in crates/nu-command/src/strings/str_/replace.rs:212 based on an error returned from Regex::new(find_str) at :183. I see that find_str going in has value "C:\Users" as reported in the debugger, so it will indeed look like an invalid hex escape in the regex pattern.
Did we change the default of parameter --string in str replace recently? Currently, it defaults to "use regex".
〉 'C:\Users\kubouch' | str replace --string 'C:\Users' 'foo'
foo\kubouch
Works. I'd to close this bug as "by design", but that seems rude.
This was hard to debug! We can improve the error to make Nushell nicer.
〉 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
Error: nu::shell::incorrect_value (link)
× Incorrect value.
╭─[entry #1:1:1]
1 │ 'C:\Users\kubouch' | str replace 'C:\Users' 'foo'
· ─────┬────
· ╰── Regex error: Parsing error at position 4: Invalid hex escape
╰────
I am amazed that there was no existing 'invalid argument' to use here. There are bound to be lots of places ::IncorrectValue sould be used instead of ::UnsupportedInput, and the latter should be renamed to ::UnsupportedType to reduce the ambiguity.
Anyway, I'll submit this use of it as a PR for your consideration.
Ah, I didn't know it's regex by default. IMO it should be plain string instead of regex by default. I'll change the issue description. Thanks for looking into it @bobhy!
I'm a Windows user too, so I'm accustomed to making many accommodations to its path syntax, but I'd hate to reduce Nu capabilities just to cater to that scenario. Nushell on Windows should support POSIX paths, like Python and Powershell do (but, unlike the latter, should not revert to backslashes when doing command line completion).
Deciding which way the str replace default should go is a hard choice. Either way, you know there will be a substantial minority of users disadvantaged.
Perhaps better would be to define an additional subcommand instead, e.g, str substitute. That way each community gets a clean command to focus on, nobody has to lose.
Btw, one thought I had for a more general solution was to introduce a way to create path objects. Then when a command receives such an object it can either handle it correctly or error and suggest the correct flag.
JT suggested something like p'C\Users' to make a path object (my initial suggestion was path 'C:\Users').