
Spellcheck improvements

Open agitter opened this issue 4 years ago • 2 comments

#333 introduced the first version of a spellchecker, but there are several remaining areas for improvement.

GitHub Actions: ✔️ We should be able to run the spellcheck filter in GitHub Actions as well and print the misspelled words and their locations in the build log. In my initial attempt, the spellcheck.lua file downloaded in an earlier step in the workflow was not available in the "Build Manuscript" step. I'll need to debug that.

AppVeyor message: ✔️ AppVeyor's pull request comment should only use the head of the spelling errors file.

Customizing the language: We could use the language specified in metadata.lang to determine which Aspell dictionary to download and use as the master dictionary. See https://github.com/manubot/rootstock/pull/333#issuecomment-619259591.

Discrepancies between spellcheck filter and grep: Once this is used with real manuscripts, we should watch for different behaviors between the spellcheck filter and the grep call used to locate misspelled words. The filter is run on output.md, but grep runs on the input *.md files, so template variables and other details are handled differently. In this build "CDC’s" is identified as a misspelled word, but the grep search does not detect it because the original markdown file contains "CDC's" (different apostrophe character).
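One possible way to handle the apostrophe mismatch is to normalize typographic quotes on both sides before searching. This is only a rough sketch in Python, not what build.sh does; `normalize_quotes` and `find_word` are hypothetical names, and the real search would still need to deal with template variables.

```python
import re
from typing import List

# Map typographic quotes to their ASCII equivalents.
_QUOTE_MAP = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (curly apostrophe)
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def normalize_quotes(text: str) -> str:
    """Replace curly quotes and apostrophes with straight ones."""
    return text.translate(_QUOTE_MAP)


def find_word(word: str, lines: List[str]) -> List[int]:
    """Return 1-based line numbers containing `word`, ignoring quote style.

    Hypothetical helper: both the misspelled word (from output.md) and the
    source lines (from the input *.md files) are normalized before matching.
    """
    pattern = re.compile(r"\b" + re.escape(normalize_quotes(word)) + r"\b")
    return [
        number
        for number, line in enumerate(lines, start=1)
        if pattern.search(normalize_quotes(line))
    ]
```

With this, a word reported as "CDC’s" would also match a source line that contains "CDC's".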

agitter, Apr 29 '20 03:04

I read through the Lua filter code, and it does detect the languages of the Pandoc document and passes them to aspell with the -l argument. Therefore, to support multiple languages, we should only need to install the correct dictionaries.
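As a rough illustration of the "install the correct dictionaries" step, the sketch below derives dictionary package names from language tags. It assumes dictionaries are packaged as `aspell-<lang>` (as with `aspell-en` or `aspell-de` on Debian-style systems), which may not match how the build environment actually obtains them; `base_language` and `dictionary_packages` are hypothetical helpers.

```python
from typing import Iterable, List


def base_language(lang_tag: str) -> str:
    """Return the primary language subtag, e.g. 'en-US' or 'en_US' gives 'en'."""
    return lang_tag.replace("_", "-").split("-")[0].lower()


def dictionary_packages(lang_tags: Iterable[str]) -> List[str]:
    """Map language tags to assumed 'aspell-<lang>' package names.

    Duplicates are dropped while preserving first-seen order. The naming
    scheme is an assumption, not something the spellcheck filter defines.
    """
    packages: List[str] = []
    for tag in lang_tags:
        name = "aspell-" + base_language(tag)
        if name not in packages:
            packages.append(name)
    return packages
```

For a manuscript with lang set to en-US and a section marked as de, this would give `aspell-en` and `aspell-de`.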

agitter, May 01 '20 19:05

Make case sensitivity an option. Currently this requires editing two lines: https://github.com/manubot/rootstock/blob/master/build/build.sh#L90 and https://github.com/manubot/rootstock/blob/master/build/build.sh#L112

In the first case, set ignore-case to false. In the second, remove the -i flag from the grep command.
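To illustrate the idea of a single switch, here is a small Python sketch that builds the grep arguments from one boolean. The flags other than -i are assumptions for the example and are not copied from build.sh; `grep_command` is a hypothetical helper, and the same boolean would also have to drive the filter's ignore-case setting.

```python
from typing import List, Sequence


def grep_command(word: str, files: Sequence[str], ignore_case: bool = True) -> List[str]:
    """Build a grep argument list for locating a misspelled word.

    -n prints line numbers and -w matches whole words (both assumed here);
    -i is added only when case should be ignored.
    """
    command = ["grep", "-n", "-w"]
    if ignore_case:
        command.append("-i")
    # "--" ends option parsing so words starting with "-" are safe.
    command += ["--", word, *files]
    return command
```

The resulting list could be passed to `subprocess.run`, so turning case sensitivity on or off would be a one-line change to the option rather than two edits to the script.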

cgreene, May 19 '20 14:05