stringr
R stringr unexpected behavior with str_replace & str_pad. Bug or Layer-8 problem?
Hi, I am using R 4.1.3 and the stringr package 1.4.0, and I get unexpected results from this code:
stringr::str_replace(string = "5",
                     pattern = "([0-9]+)",
                     replacement = stringr::str_pad(string = "\\1", width = 3, side = "left", pad = "0"))
Expected: "005"; Result: "05".
All the parts generate the expected results:
(1) The padding
stringr::str_pad(string = "5", width = 3, side = "left", pad = "0")
Returns "005"
(2) The regex match
stringr::str_replace(string = "5", pattern = "([0-9]+)", replacement = "\\1")
Returns "5".
Only the combination of these two leads to unexpected behavior.
Using an anonymous function works, too:
stringr::str_replace(string = "5",
                     pattern = "([0-9]+)",
                     replacement = {\(x) stringr::str_pad(string = x, width = 3, side = "left", pad = "0")})
It seems that "\\1" refers to the content of the capture group, but the character length is determined from the literal "\\1".
stringr::str_replace(string = "5",
                     pattern = "([0-9]+)",
                     replacement = {\(x) as.character(nchar(x))})
stringr::str_replace(string = "5",
                     pattern = "([0-9]+)",
                     replacement = as.character(nchar("\\1")))
These two examples return "1" and "2". The second example always returns "2" as the replacement for the captured group, independent of its content.
Is this intended behavior and I am using the package wrong, or did I stumble across an actual issue in the package?
Thx for developing and maintaining such a great library.
For reference: I originally asked this on Stack Overflow.
That all looks ok to me: str_pad() can't know what \\1 is, so you have to use it as a replacement function.
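To make the evaluation order visible, here is a minimal sketch, assuming stringr 1.4.0 or later as in the question. The variable names (padded_template, eager, lazy) are made up for illustration. R evaluates the replacement argument before str_replace() is called, so str_pad() pads the two-character literal rather than the matched text.

```r
library(stringr)

# The replacement argument is evaluated first, so str_pad() only sees
# the literal "\\1": two characters, a backslash and the digit 1.
padded_template <- str_pad("\\1", width = 3, side = "left", pad = "0")

# padded_template now holds a single "0" followed by the back-reference;
# str_replace() substitutes the capture group into that template.
eager <- str_replace("5", "([0-9]+)", padded_template)

# A function is called with the matched text itself, so the padding is
# computed from "5" rather than from the literal "\\1".
lazy <- str_replace("5", "([0-9]+)",
                    function(m) str_pad(m, width = 3, side = "left", pad = "0"))

print(c(eager = eager, lazy = lazy))
```

Here eager reproduces the "05" from the question and lazy gives the expected "005".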