usethis icon indicating copy to clipboard operation
usethis copied to clipboard

Improving `use_spell_check()` to address R CMD check error "Possibly misspelled words in DESCRIPTION"

Open coatless opened this issue 2 years ago • 6 comments

Brief description of the problem

The usethis::use_spell_check() option creates a WORDLIST that is interpreted by only the {spelliing} package. Unfortunately, CRAN does not use the {spelling} package as part of the check. So, any words added to the custom dictionary provided by WORDLIST will come up as a dreaded note:

  • checking CRAN incoming feasibility ... NOTE Possibly misspelled words in DESCRIPTION:

Instead, they prefer the approach listed in ?"aspell-utils" that uses the .aspell top-level directory. That is, they check inside the .aspell for the inclusion of defaults.R alongside a user-defined RDS. A sample structure would look like:

├── .aspell
│   ├── defaults.R
│   ├── zspelling.rds
│   └── words.pws
├── DESCRIPTION
├── NAMESPACE
├── R
│   └── zspelling-package.R
├── inst
│   └── WORDLIST
├── man
│   └── zspelling-package.Rd
└── zspelling.Rproj

We can see where Kurt Hornik also used a similar structure: {relations} and {clue}. Dirk wrote this up a while back and, then, implemented it in {RcppArmadillo}. For a more recent addition, please see https://github.com/tmsalab/edmdata/commit/45b4ec6760c36ded7ffa6e356877dc8dcae49bd0.

With this being said, the workflow used is not documented in Writing R Extensions nor CRAN Repository Policies. Though, we can implement the desired functionality by adding a pre-commit hook to regenerate the package dictionary RDS when WORDLIST changes (i.e., in the case of the example above, zspelling.rds would be rebuilt). However, this technique only works with R versions starting from 3.0.0 onwards per the R 3.0.0 News Release Note:

aspell() gains filters for R code, Debian Control Format and message catalog files, and support for R level dictionaries. In addition, package utils now provides functions aspell_package_R_files() and aspell_package_C_files() for spell checking R and C level message strings in packages.

(Note, the lowest level of support for usethis is tagged to R 3.6.* at the moment.)

To support earlier versions, there would need to be a words.pws file generated and placed under .aspell/ directory. In particular, the .pws file is generated using an external dependency on either aspell, hunspell, or ispell; but, I'm not sure this is needed as the flat version would look like:

personal_ws-1.1 en 4 UTF-8
et
al

Perhaps only the metadata line would need to be prepended to WORDLIST?

coatless avatar Jul 24 '21 23:07 coatless

However, this technique only works with R versions starting from 3.0.0 onwards... Note, the lowest level of support for usethis is tagged to R 3.6.* at the moment.

Currently, usethis has R (>= 3.2) in DESCRIPTION:

https://github.com/r-lib/usethis/blob/0281a158f1b8280c048774e02010e2db7892b472/DESCRIPTION#L30

but that's just because we haven't rolled / advanced this version recently to reflect our standard policy re: old R versions. According to that, the real current commitment is to support R >= 3.4. So I don't think we have any concerns around R version.

The bigger question is whether we want to get into the business of managing this .aspell dictionary. Thoughts @Hadley @malcolmbarrett?

For the record let me also note it's currently a bit ambiguous whether spelling activities are in scope for devtools or usethis, as we already have devtools::spell_check() and usethis::use_spell_check().

jennybc avatar Aug 17 '21 22:08 jennybc

I think this feature would be useful enough to be worth investing some time in — it would be so nice to be able to include words in the package description without having to surround them in '.

hadley avatar Aug 18 '21 12:08 hadley

So would we change the meaning of usethis::use_spell_check()? I.e. favour the .aspell approach over the other WORDLIST? Or just start managing the same conceptual wordlist such that it exists in forms that satisfy both the spelling package and CRAN?

jennybc avatar Aug 18 '21 15:08 jennybc

I think we can make use_spell_check() do both.

hadley avatar Aug 18 '21 16:08 hadley

I started experimenting with this after bumping into said dreaded note for a new package, but I wonder if it's better to have the spelling package handle this? That way, every time the word list is updated, .aspell can be dynamically updated. I tried to update it on the fly based on what's currently in WORDLIST but winbuilder didn't seem to like that.

malcolmbarrett avatar Oct 05 '21 22:10 malcolmbarrett

See https://ropensci.org/blog/2022/01/21/ropensci-news-digest-january-2022/#to-quote-or-not-to-quote-non-existing-words-in-description for a related blog post.

CC @maelle.

krlmlr avatar Feb 03 '23 19:02 krlmlr