rootstock
Spellcheck improvements
#333 introduced the first version of a spellchecker, but there are several remaining areas for improvement.
GitHub Actions: ✔️
We should be able to run the spellcheck filter in GitHub Actions as well and print the misspelled words and their locations in the build log. In my initial attempt, the `spellcheck.lua` file downloaded in an earlier step in the workflow was not available in the "Build Manuscript" step. I'll need to debug that.
AppVeyor message: ✔️
AppVeyor's pull request comment should only use the head of the spelling errors file.
Customizing the language:
We could use the language specified in `metadata.lang` to determine which Aspell dictionary to download and use as the master dictionary. See https://github.com/manubot/rootstock/pull/333#issuecomment-619259591.
Discrepancies between spellcheck filter and grep:
Once this is used with real manuscripts, we should watch for different behaviors between the spellcheck filter and the `grep` call used to locate misspelled words. The filter is run on `output.md`, but `grep` runs on the input `*.md` files, so template variables and other details are handled differently. In this build "CDC’s" is identified as a misspelled word, but the `grep` search does not detect it because the original markdown file contains "CDC's" (different apostrophe character).
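One possible way to handle the apostrophe mismatch is to turn each misspelled word into a pattern that accepts either apostrophe before searching the input files. This is only a sketch: `apostrophe_pattern` is a hypothetical helper, not part of the current build script, and it assumes the word contains no other regular-expression metacharacters.

```shell
# Hypothetical helper (not in build.sh): build an extended-regex pattern that
# matches a word written with either the ASCII apostrophe (') or the
# typographic apostrophe (’).
apostrophe_pattern() {
  # First normalize ’ to ', then expand every ' into an alternation.
  printf '%s' "$1" | sed -e "s/’/'/g" -e "s/'/('|’)/g"
}

# Example use (the content/*.md path is an assumption about the manuscript layout):
#   grep -nE "$(apostrophe_pattern "CDC’s")" content/*.md
```

With this approach the word reported by the filter and the word in the source file no longer need to use the same apostrophe character.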
I read through the Lua filter code, and it does detect the languages of the Pandoc document and pass them to aspell with the `-l` argument. Therefore, to support multiple languages, we should only need to install the correct dictionaries.
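As a rough sketch of how the build could decide which dictionary to install, the language tag from the metadata could be reduced to its primary subtag. `primary_lang` is a hypothetical helper, the `en` fallback is an assumption, and whether the primary subtag matches the name of an available Aspell dictionary would need to be checked per language.

```shell
# Hypothetical helper (not in build.sh): reduce a language tag such as
# "en-US" or "de_DE" to its primary subtag ("en", "de").
# Falls back to "en" when no language is given (an assumption).
primary_lang() {
  lang="${1:-en}"
  lang="${lang%%-*}"   # drop everything after the first hyphen
  lang="${lang%%_*}"   # drop everything after the first underscore
  printf '%s' "$lang"
}

# Example: primary_lang "pt-BR" prints "pt"
```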
Make case sensitivity an option. Currently this requires editing two lines: https://github.com/manubot/rootstock/blob/master/build/build.sh#L90 and https://github.com/manubot/rootstock/blob/master/build/build.sh#L112
In the first case, set `ignore-case` to false. In the second, remove the `-i` flag from the `grep` command.
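A single switch could drive both settings so they cannot drift apart. The following is a minimal sketch, not the current contents of `build.sh`: the function name and the `SPELLCHECK_IGNORE_CASE`, `ASPELL_IGNORE_CASE`, and `GREP_CASE_FLAG` variables are hypothetical.

```shell
# Hypothetical helper: derive both case-related settings from one value.
# Pass "true" (the default) to ignore case, anything else to be case sensitive.
spellcheck_case_settings() {
  if [ "${1:-true}" = "true" ]; then
    ASPELL_IGNORE_CASE="ignore-case true"   # fragment for the aspell configuration
    GREP_CASE_FLAG="-i"                     # flag for the grep call
  else
    ASPELL_IGNORE_CASE="ignore-case false"
    GREP_CASE_FLAG=""
  fi
}

# Example use, reading the choice from an (assumed) environment variable:
#   spellcheck_case_settings "${SPELLCHECK_IGNORE_CASE:-true}"
#   grep $GREP_CASE_FLAG ...   # left unquoted so an empty flag disappears
```

The build script could then splice `ASPELL_IGNORE_CASE` into the aspell configuration and `GREP_CASE_FLAG` into the `grep` command, in place of the two hard-coded values.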