precommit icon indicating copy to clipboard operation
precommit copied to clipboard

Feature: allow package-loading prior to lint hook

Open russHyde opened this issue 3 years ago • 18 comments

Is your feature request related to a problem? Please describe. When running the lintr hook on a package, it flags many object_usage_linter errors. This happens because, unless the package is loaded, lintr is unable to distinguish when a referenced object is completely undefined from when it is defined within the package. Typically, we would lint the whole package in CI but use pre-commit to only lint those files in a commit set. We could turn off the object_usage_linter in the .lintr config, but that config affects both precommit and CI runs of lintr, so would mean that this useful linter would be unavailable in CI.

Describe the solution you'd like Would it be possible to add a command line option to the pre-commit linting script that determines whether the package in a repo should be loaded prior to running lintr. For example, Rscript lintr.R <files> would run as currently implemented. But Rscript lintr.R --load-package <files> would run pkgload::load_all() before running lintr on the files.

russHyde avatar Aug 12 '22 10:08 russHyde

Thanks @russHyde. What about the option to lint the whole package instead of the staged files? Is that also an option? I prefer to not use pkgload::load_all() because that requires all dependencies of the package under linting to be in the same {renv} where {precommit} is installed...

lorenzwalthert avatar Aug 12 '22 12:08 lorenzwalthert

That's interesting, I hadn't considered the environment where precommit/lintr is running. This might be something better handled on our (lintr's) end - for example, having a way to selectively deactivate linters based on environment variables, so that some linters are only active in CI.

russHyde avatar Aug 12 '22 13:08 russHyde

Is there smart caching built in in {lintr}, or is running it on all files significantly slower?

lorenzwalthert avatar Aug 12 '22 14:08 lorenzwalthert

any thoughts @MichaelChirico?

lorenzwalthert avatar Aug 13 '22 17:08 lorenzwalthert

lintr is typically taking O(10 seconds) on full packages, but yes there is an expression-level caching mechanism

MichaelChirico avatar Aug 13 '22 20:08 MichaelChirico

having a way to selectively deactivate linters based on environment variables, so that some linters are only active in CI.

I think this sounds like introducing intransparency. Another option that we considered in #393 is to use language: script and run the hook in the global R environment. However, this also means that the check can't be run in pre-commit.ci (which is fine for that specific hook, but not for lintr hooks). Alternatively, we could have a hook running locally (and skipped on ci) and one running only on CI (but skipped locally). However, these are all workarounds for me.

lintr is typically taking O(10 seconds) on full packages, but yes there is an expression-level caching mechanism

If you cache files, would it get significantly faster? Then that would be the solution in my eyes (offloading my own work to others has always been my easy way out 😄).

lorenzwalthert avatar Nov 12 '22 18:11 lorenzwalthert

Also note that theoretically, {lintr} does not have to offer the functionality to deactivate linters based on environment variables, even if we wanted to use env variables (or pre-commit arguments in .pre-commit-config.yaml). Because in our hook script, we could also load the {lintr} config, modify it and then call {lintr} with passing the modified contents explicitly as config (I guess this should work). Or just make behaviour different for local vs cloud execution (but language is fix), but that also is not very transparent to me.

lorenzwalthert avatar Nov 12 '22 18:11 lorenzwalthert

I agree that the env-variables make it less transparent. I might experiment with writing a lintr hook that allows the user to specify a lintr config, so that I can have a .lintr.pre-commit (for use in local pre-commit) and a .lintr config (for use interactively and in CI)

russHyde avatar Nov 14 '22 10:11 russHyde

Yeah ok. Here's how I'd do it.

  • For local, you could also use language: script to use the global R library (so forget about additional_dependencies:). devtool::load_all() and then lint only files to be committed.
  • For remote, use language: r that uses another config file and runs lintr on the whole package. Since it's impossible to skip hooks locally and only run remotely IIRC, you could at the top of this script check if CI env variable is set (I believe it's set in pre-commit.ci) and skip the remainder.

lorenzwalthert avatar Nov 14 '22 11:11 lorenzwalthert

For local, you could also use language: script to use the global R library (so forget about additional_dependencies:). devtool::load_all() and then lint only files to be committed.

@lorenzwalthert I'm not familiar enough with the setup of .pre-commit-config.yaml to figure out how to apply the solution you propose. Do you mind clarifying that a bit? Below is my current .pre-commit-config.yaml; how should I change it to use the global R library as you suggest? (I'm only concerned with doing this locally, not on remote)

# All available hooks: https://pre-commit.com/hooks.html
# R specific hooks: https://github.com/lorenzwalthert/precommit
repos:
-   repo: https://github.com/lorenzwalthert/precommit
    rev: v0.3.2.9003
    hooks: 
    -   id: style-files
        args: [--style_pkg=styler, --style_fun=tidyverse_style]    
    -   id: roxygenize
        # roxygen requires loading pkg -> add dependencies from DESCRIPTION
        additional_dependencies:
        -    assertr
        -    assertthat
        -    digest
        -    dplyr
        -    glue
        -    methods
        -    purrr
        -    stringr
        -    tibble
        -    tidyr
        -    utils
        -    anthonynorth/roxyglobals
    # codemeta must be above use-tidy-description when both are used
    -   id: codemeta-description-updated
    -   id: use-tidy-description
    -   id: spell-check
        exclude: >
          (?x)^(
          .*\.[rR]|
          .*\.feather|
          .*\.jpeg|
          .*\.pdf|
          .*\.png|
          .*\.py|
          .*\.RData|
          .*\.rds|
          .*\.Rds|
          .*\.Rproj|
          .*\.sh|
          (.*/|)\.gitignore|
          (.*/|)\.gitlab-ci\.yml|
          (.*/|)\.lintr|
          (.*/|)\.pre-commit-.*|
          (.*/|)\.Rbuildignore|
          (.*/|)\.Renviron|
          (.*/|)\.Rprofile|
          (.*/|)\.travis\.yml|
          (.*/|)appveyor\.yml|
          (.*/|)NAMESPACE|
          (.*/|)renv/settings\.dcf|
          (.*/|)renv\.lock|
          (.*/|)WORDLIST|
          \.github/workflows/.*|
          data/.*|
          )$
    -   id: lintr
    -   id: readme-rmd-rendered
    -   id: parsable-R
    -   id: no-browser-statement
    -   id: no-debug-statement
    -   id: deps-in-desc
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks: 
    -   id: check-added-large-files
        args: ['--maxkb=200']
    -   id: file-contents-sorter
        files: '^\.Rbuildignore$'
    -   id: end-of-file-fixer
        exclude: >
          (?x)^(
          .*\.Rd|
          tests/testthat/_snaps/.*\.md|
          )$
-   repo: https://github.com/pre-commit-ci/pre-commit-ci-config
    rev: v1.5.1
    hooks:
    # Only reuiqred when https://pre-commit.ci is used for config validation
    -   id: check-pre-commit-ci-config
-   repo: local
    hooks:
    -   id: forbid-to-commit
        name: Don't commit common R artifacts
        entry: Cannot commit .Rhistory, .RData, .Rds or .rds.
        language: fail
        files: '\.(Rhistory|RData|Rds|rds)$'
        # `exclude: <regex>` to allow committing specific files

ci:
    autoupdate_schedule: monthly

joelnitta avatar Nov 16 '22 00:11 joelnitta

Like this:

-   repo: local
    hooks:
    -   id: local-lintr
        name: Run lintr with global R environment
        entry: Rscript -e "devtools::load_all(); output <- lintr::lint(commandArgs(trailingOnly=TRUE)); print(output); if (length(output) > 0) stop('Files not lint free')"
        language: system
        files: '(\.[rR]profile|\.R|\.Rmd|\.Rnw|\.r|\.rmd|\.rnw)$'
  

You could also move the R code in entry to a file and use language: script with entry: path/to/script.R where the script must be executable by the user.

lorenzwalthert avatar Nov 16 '22 13:11 lorenzwalthert

Pre-commit passes the file names to entry as unnamed arguments.

lorenzwalthert avatar Nov 16 '22 13:11 lorenzwalthert

@lorenzwalthert thanks!

However, I noticed this only works if a single R file is being committed, since lintr::lint() only works on a single file.

Below I have modified it to work on one or more R files:

-   repo: local
    hooks:
    -   id: local-lintr
        name: Run lintr with global R environment
        entry: > 
          Rscript -e "devtools::load_all();
          output <- commandArgs(trailingOnly=TRUE) |> strsplit(' ') |>
          unlist() |> lapply(lintr::lint) |> purrr::compact();
          if (length(output) > 0) {print(output); stop('Files not lint free')}"
        language: system
        files: '(\.[rR]profile|\.R|\.Rmd|\.Rnw|\.r|\.rmd|\.rnw)$'

joelnitta avatar Nov 17 '22 01:11 joelnitta

(BTW I would be interested in running this with CI also if you could provide more details on that as well)

joelnitta avatar Nov 17 '22 01:11 joelnitta

Thanks for your snipped. Also, you may want to print lilntr::lint() output, so in case the hook does not pass, the problems are displayed.

For CI, this does not work (system hooks are not supported with pre-commit.ci). Potentially later with GitHub Actions and #450, but for now, maybe you could just use the lintr action from r-lib/actions?

lorenzwalthert avatar Nov 17 '22 08:11 lorenzwalthert

It does print the output (I just ran a test to verify): {print(output); stop('Files not lint free')}

So for CI, there is no need to modify .pre-commit-config.yaml right? The CI will just ignore - repo: local?

joelnitta avatar Nov 18 '22 00:11 joelnitta

Not sure it skips the local hook by default, you could also skip it explicitly as mentioned in the pre-commit.ci docs:

ci:
    skip: [pylint]

repos:
-   repo: local
    hooks:
    -   id: pylint
        # ...

lorenzwalthert avatar Nov 18 '22 10:11 lorenzwalthert

Great you achieved what you wanted. A solution to make this work withouth local hook would still be to

  • add a flag --load-package and call pkgload::load_all() if the flag is set (no need for it to be installed in the renv if the flag is not set
  • and to require the user to list all dependencies plus {pkgload} in additional_dependencies.

Or to use language: system for the hook and let useres use pre-commit CI lite to manage dependencies (instead of renv).

lorenzwalthert avatar Jan 07 '24 10:01 lorenzwalthert