rcmdcheck icon indicating copy to clipboard operation
rcmdcheck copied to clipboard

FR: More useful parsing of output by parse_check()

Open MichaelChirico opened this issue 3 years ago • 4 comments

Currently, parse_check() only provides some rudimentary parsing of the 00check.log, e.g. a NOTE just becomes the full string found in the log, e.g.

writeLines(check$notes)

output:

checking if this is a source package ... NOTE
Found the following apparent object files/libraries:
  inst/libs/pkg.so
Object files/libraries should not be included in a source package.

checking dependencies in R code ... NOTE
Packages in Depends field not imported from:
  ‘int64’ ‘rcmdcheck’
  These packages need to be imported from (in the NAMESPACE file)
  for when this namespace is loaded but not attached.
Unexported object imported by a ':::' call: ‘data.table:::.shallow’
  See the note in ?`:::` about the use of this operator.

checking R code for possible problems ... NOTE
WritePkg.list: no visible binding for global variable ‘is’
Undefined global functions or variables:
  is
Consider adding
  importFrom("methods", "is")
to your NAMESPACE file (and ensure that your DESCRIPTION Imports field contains 'methods').

I think this would be more useful if we did better text parsing of the NOTEs, e.g. this could be a data.frame (or tibble/etc) with two columns: (1) source of note (here, 'checking if this is a source package', 'checking dependencies in R code', 'checking R code for possible problems', something like that), (2) a list column with content describing the note (here, [[1]] 'apparent object file', 'inst/libs/pkg.so' [[2]][[1]] 'Depends not imported', 'int64', 'rcmdcheck' [[2]][[2]] '::: call', 'data.table:::.shallow' [[3]] 'Undefined global function', 'is')

On the downside: (1) tools/R/check.R is huge, so the potential amount of parsing code to be written is also huge; and (2) it might be fragile to write text parsing as the check.R code changes over time.

For that, I'd say -- only writing parsers for the most common cases would still be great.

MichaelChirico avatar Jan 05 '22 23:01 MichaelChirico

I agree that this would be great, but yeah, this does seem like a lot of maintenance work. So at this point I would say that it is unlikely that we will do this.

gaborcsardi avatar Jan 06 '22 08:01 gaborcsardi

FWIW we found tools::check_packages_in_dir_details() at least does the step of peeling off the "check type" as the Check column. that part at least would be easy... it's parsing the Output column that's the real bear

MichaelChirico avatar Jan 06 '22 08:01 MichaelChirico

Maybe, but we would need to double check even tools::check_packages_in_dir_details() because it only works for the current R version for sure, whereas our parser should be able to parse output from other versions.

gaborcsardi avatar Jan 06 '22 09:01 gaborcsardi

It looks like that Check column is very simple text processing...

checking if this is a source package ... NOTE --> Check = "if this is a source package" checking dependencies in R code ... NOTE --> Check = "dependencies in R code" checking R code for possible problems ... NOTE --> Check = "R code for possible problems"

Which should not depend much on the version

MichaelChirico avatar Jan 06 '22 18:01 MichaelChirico