stringr
Add new `str_dedent` function
One incredibly helpful function in Python is textwrap.dedent. Under the hood, it uses regex to strip the leading whitespace that is common to all lines, while preserving any relative indentation within a chunk of code.
This PR re-implements the same functionality in native R code.
I've included four unit tests for it.
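To make the behavior described above concrete, here is a minimal base R sketch of the dedent idea. It is not the implementation in this PR: `dedent_sketch()` is a hypothetical name, and the sketch assumes that a tab and a space each count as one unit of indentation, which is a simplification compared with Python's textwrap.dedent.

```r
# Hypothetical helper for illustration; not the str_dedent() from this PR.
dedent_sketch <- function(x) {
  # strsplit() drops a trailing empty piece, so append "\n" to keep a
  # final newline in the input visible as an empty last element
  lines <- strsplit(paste0(x, "\n"), "\n", fixed = TRUE)[[1]]

  # Only lines with non-whitespace content decide the common indent
  has_text <- grepl("[^ \t]", lines)
  if (!any(has_text)) {
    return(x)
  }

  # Width of the leading whitespace on each contentful line
  indent <- attr(regexpr("^[ \t]*", lines[has_text]), "match.length")
  n <- min(indent)
  if (n == 0L) {
    return(x)
  }

  # Remove up to n leading whitespace characters from every line
  lines <- sub(sprintf("^[ \t]{0,%d}", n), "", lines)
  paste(lines, collapse = "\n")
}

dedent_sketch("    Hello\n      World")
#> [1] "Hello\n  World"
```

Note that this sketch keeps a final newline if the input has one, which is the point debated further down in the thread.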
I'd also like to have a dedent() function in R. Two comments:
- Wouldn't it make sense to trim leading and trailing whitespace in the output, or add an argument to do so? That would be very useful when making character strings from multi-line text in R:
str_dedent(" This is a long sentence that starts on the first line, continues on the second, and ends on the third one. ")
For readability purposes, it's better to start the text on its own line here. We wouldn't want to keep that blank line at the beginning though. Passing the output to str_trim() every time in that situation would be cumbersome.
- The function currently doesn't support the following situation (whereas Python's textwrap.dedent() does):
str_dedent(" foo bar ")
It's probably uncommon enough not to support it, but that could be detailed in the function documentation.
Like @lionel-, I think I am also surprised by the trailing \n here
library(stringr)
library(glue)
glue_chr <- function(...) {
  unclass(glue(...))
}
str_dedent("
Line 1
Line 2
Line 3
")
#> [1] "Line 1\nLine 2\nLine 3\n"
glue_chr("
Line 1
Line 2
Line 3
")
#> [1] "Line 1\nLine 2\nLine 3"
I would have expected the above to give this output
str_dedent("
Line 1
Line 2
Line 3")
#> [1] "Line 1\nLine 2\nLine 3"
glue_chr("
Line 1
Line 2
Line 3")
#> [1] "Line 1\nLine 2\nLine 3"
I think an invariant of this function could be:
Strips all leading and trailing whitespace from the output
which provides a nice symmetry and nice user experience
@DavisVaughan you mean "Strips all leading and trailing whitespace lines from the output" right?
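As a rough illustration of the invariant being discussed (dropping whitespace-only lines at the start and end, rather than all whitespace), here is a small base R sketch. `trim_blank_edges()` is a hypothetical name used only for this example; it is not part of stringr or glue, and it deliberately leaves the indentation of the remaining lines alone, since that is the dedent step's job.

```r
# Hypothetical helper for illustration; not part of stringr or glue.
trim_blank_edges <- function(x) {
  # Drop a leading run of whitespace-only lines, including their newlines
  x <- sub("^([ \t]*\n)+", "", x)
  # Drop trailing newlines and any whitespace-only lines at the end
  sub("(\n[ \t]*)+$", "", x)
}

trim_blank_edges("\n  Line 1\n  Line 2\n")
#> [1] "  Line 1\n  Line 2"
```

Under this reading the indentation on "Line 1" survives, whereas "strips all leading and trailing whitespace" in the str_trim() sense would also remove it.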
And you both really think we don't want a trailing new line? If you were going to cat() this you would want a trailing \n?
First, I'll just add some bits from glue's documentation for reference:
Empty first and last lines are automatically trimmed, as is leading whitespace that is common across all lines. ... If you want an explicit newline at the start or end, include an extra empty line. ... Leading and trailing whitespace from the first and last lines is removed.
A uniform amount of indentation is stripped from the second line on, equal to the minimum indentation of all non-blank lines after the first.
As for this:
And you both really think we don't want a trailing new line? If you were going to cat() this you would want a trailing \n?
If you're just cat()ing the result of 1 str_dedent() call, I think it doesn't matter because the R console basically adds the trailing newline. And in more complicated situations, this is when I'd probably use cli::cat_line() anyway.
@jennybc that's only true if it's at the top-level (i.e. if you cat multiple times before returning, you need newlines in between them).
I'm not sure why I'm so far apart from the rest of you on trailing newlines. I thought this was a situation where preserving them was "obviously correct".
if you cat multiple times before returning, you need newlines in between them
I guess that's when I would use cli::cat_line().
We're talking a lot about glue, which stringr imports. Which makes me wonder ... why isn't str_dedent() just glue::trim()? 🤔
That is a good question. If I replace the existing implementation with a direct call to glue::trim() then I get the following failures:
── Failure (test-remove.R:12:3): strips common ws ──────────────────────────────
str_dedent(" Hello\n World") (`actual`) not equal to "Hello\n World" (`expected`).
`lines(actual)`: "Hello" "World"
`lines(expected)`: "Hello" " World"
── Failure (test-remove.R:13:3): strips common ws ──────────────────────────────
str_dedent(" Hello\n World") (`actual`) not equal to " Hello\nWorld" (`expected`).
`lines(actual)`: "Hello" "World"
`lines(expected)`: " Hello" "World"
── Failure (test-remove.R:25:3): preserves final newline ───────────────────────
str_dedent(" Hello\n World\n") (`actual`) not equal to "Hello\nWorld\n" (`expected`).
`lines(actual)`: "Hello" "World"
`lines(expected)`: "Hello" "World" ""
── Failure (test-remove.R:35:3): preserves final newline ───────────────────────
str_dedent("\n Hello\n World\n ") (`actual`) not equal to "Hello\nWorld\n" (`expected`).
`lines(actual)`: " Hello" " World"
`lines(expected)`: "Hello" "World" ""
I can make most of them go away by adding a leading \n to the strings (which better reflects real use), but the remaining weirdness is this:
cat(
glue::trim("
Hello
World
")
)
#> Hello
#> World
Created on 2025-09-24 with reprex v2.1.1
I find the extra indent here surprising. But maybe we could fix that?
If you're just cat()ing the result of 1 str_dedent() call, I think it doesn't matter because the R console basically adds the trailing newline
The Positron and RStudio consoles do (at top-level as Hadley mentions), but not the R console.
If you were going to cat() this you would want a trailing \n?
I would expect the caller of cat() to add the trailing \n. But I think that's a case where I'd pipe the output to writeLines() (or use cat_line() as Jenny suggests).
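For reference, the difference between cat() and writeLines() for a string without a trailing newline can be seen with base R alone. This is a small sketch, assuming the dedented result looks like the string `x` below; capture.output() is used only so the printed lines can be inspected as a character vector.

```r
# A dedented string with no trailing newline (assumed for illustration)
x <- "Line 1\nLine 2"

# cat() adds nothing, so consecutive calls run together on one line
out_cat <- capture.output({
  cat(x)
  cat(x)
})
out_cat
#> [1] "Line 1"       "Line 2Line 1" "Line 2"

# writeLines() terminates each element with a newline
out_lines <- capture.output({
  writeLines(x)
  writeLines(x)
})
out_lines
#> [1] "Line 1" "Line 2" "Line 1" "Line 2"
```

So if the function strips the final newline, callers who print several chunks in a row would need writeLines() (or an explicit "\n" passed to cat()) to keep the chunks on separate lines.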