lintr
lintr copied to clipboard
Turn off line length linter within string literals (optionally?)
We've got a very common scenario where string literals are used to contain big SQL queries.
Crucially, the style guide for SQL & R differ in that the former's max line width is 100, the latter 80.
Moreover, to skip only one such line we've got to somewhat awkwardly inject the nolint tag inside the SQL:
query <- "
SELECT SUM(CASE WHEN some_column > 5 THEN 5 ELSE some_column END) AS output_column -- # nolint
"
It seems to me the best solution here is to simply ignore string literal "bodies", perhaps with an option.
Happy to implement this if others agree.
Not sure if we should, but we have a similar problem with long machine-readable comments, so it would be nice to also control linting of comment lines.
What should qualify a line for exclusion in both cases? E.g. what if a string / comment starts past the max line length?
by "bodies" i meant to exclude the first and last line, e.g.
x <- " # included for linting
abcdef # excluded from linting
ghijkl # excluded from linting
" # included for linting
So it seems like we could add two options to the linter:
(1) exclude_comments = FALSE (by default) (2) exclude_string_bodies = FALSE (by default)
Giving this a bit more thought, one use case with the string line length issue is with very long paths, where adding newlines to the string is not an option. Therefor I feel we should exclude the string expression completely, i.e. only lint line_length if new code starts past the end of a long string or is too long in the line before the string starts.
To clarify what I mean:
# lints:
read_csv("very_long_path.csv", col_types = cols(.default = col_character()))
read_csv(col_types = cols(very_long_cols_declaration), file = "short_string")
# not lints:
read_csv("very_long_path.csv")
read_csv(
file = "very_long_path.csv",
col_types = cols(
very_long_cols_declaration
)
)
IINM this behavior could be achieved by treating every string literal as length 0
We do have loads of long file names, but usually it's because of the leading directories, and that can be solved using file.path()
.
OTOH, we also have a lot of SQL table names / nested column structures accessed with .
that more awkward to split using paste0
or paste(sep = '.')
.
Related: for me the place where the line_length always bites me is with urls. Not flagging a line that's just a single string with no spaces (or maybe detecting urls specifically?) would be a nice feature.
I often find myself doing something like:
combined_sd <- function(mean, sd, n) {
# based on
# https://math.stackexchange.com/questions/2971315/how-do-i-combine-standard-deviations-of-two-groups # nolint
...
}
NB for Stackexchange specifically, you can shorten the URL drastically:
https://math.stackexchange.com/questions/2971315/how-do-i-combine-standard-deviations-of-two-groups
Is equivalent to
https://math.stackexchange.com/q/2971315
The no spaces heuristic breaks down for long paths that contain embedded spaces. That said, it seems very hard to figure out a good heuristic which would neither cause many false-positives nor too many false negatives.
# not lints: read_csv("very_long_path.csv") read_csv( file = "very_long_path.csv", col_types = cols( very_long_cols_declaration ) )
IINM this behavior could be achieved by treating every string literal as length 0
Hmm, I might lint in the first case, requiring the string literal to be "alone" on the line to be exempted:
read_csv(
"very_long_path.csv"
)
"alone" because the latter case, where the string is the target of an EQ_SUB
, would be OK since forcing this is somewhat awkward:
read_csv(
file =
"very_long_path.csv"
)
Note to self that we'll also have to exclude the OP-COMMA
on the same line, if present.
Related: for me the place where the line_length always bites me is with urls. Not flagging a line that's just a single string with no spaces (or maybe detecting urls specifically?) would be a nice feature.
I often find myself doing something like:
combined_sd <- function(mean, sd, n) { # based on # https://math.stackexchange.com/questions/2971315/how-do-i-combine-standard-deviations-of-two-groups # nolint ... }
@d-sci ICYMI you can now (in dev & soon on CRAN) switch to # nolint next
if that's better for you:
# nolint next: line_length_linter.
# https://...