beefore
Add a task to spell check documentation
We need a Beefore task that will do a spell check of any changes to the documentation directory.
Can you elaborate, please, @freakboy3742?
I'm not sure what more detail you're looking for. Beefore currently has a task for doing things like linting, so it can find (and comment on) code format problems. There are tools available for spell checking ReST documentation; we should create a Beefore task to perform a spell check on any PR that touches the /docs directory.
What technology do I need to know before I can work on this issue, @freakboy3742?
I used Peter Norvig's spell checking algorithm to implement a spell checker. However, the spell checker will probably only work well if there is a dictionary of the words used in the document. I can submit it if you want.
Yes you can
Hello @freakboy3742, can I do this task? I am new to contributing to open source and I want to work on this.
Hey @freakboy3742, I am a newbie in the open source world. If this issue is still open, can I work on it?
Hey All - I'm looking at PyEnchant for this. Since nothing has been submitted yet, I'm researching how to put this together. Feel free to beat me to it.
Can I do this, Mr.?
@ivanzaqqa I believe @thomasoflight is working on this - but if he doesn't respond in the next few days, feel free to have a go!
Hi there @ivanzaqqa. I am currently developing this feature, albeit at a first-timer pace. If there's something else that you've seen perhaps consider that one first. Cheers!
Okay, thanks
Hello, I was thinking of something along the lines of Peter Norvig's amazing tutorial here: http://norvig.com/spell-correct.html
@cyberdrk We don't need a solution built from scratch. There are existing libraries for Sphinx that do spell checking. What we need is a plugin that integrates one of those libraries.
Hey @cyberdrk, @freakboy3742 - A while back I posted this gist which documents my initial efforts to integrate Sphinx with Beefore. @freakboy3742 do you know of any resources that can outline this process? The Sphinx documentation contains a lot of information which is challenging. @cyberdrk I'd love to work with you on this or see how you're solving it. I'm currently stepping through the Beefore codebase to see how the pieces fit together as I write a rough draft.
Hi, I'm a first time contributor but if there aren't any objections, I would like to take a crack at this issue.
I noticed that the sphinx spell checker is using a library (pyenchant) that is no longer being maintained and I didn't see another library that is as mature as that one for spell checking. How should I proceed?
@freakboy3742 Hi, I just want to confirm with you what's involved in addressing this issue. To make a new Beefore task, does this involve writing a new module (pyspellingbee?) that uses the spell checker libraries on the diffs of .rst files or other files in the docs folder?
@freakboy3742 @cyberdrk @thomasoflight @ivanzaqqa So here's what I'm thinking thus far. It seems like to add a beekeeper task for spell check, a new file (possibly named spellingbee.py) will need to be added to the beefore\checks directory. The required modules (pyenchant, maybe sphinxcontrib-spelling along with sphinx, depending on the spell check implementation) will also need to be installed, as well as the enchant C library that pyenchant wraps (also note that as of May 2018 pyenchant is no longer supported and is looking for a new supporting developer). To install these libraries, the install_requires parameter in setup.py may need to be modified to include the appropriate modules. Another solution could be to use the (potentially required?) function def prepare(directory) for modules in the beefore\checks directory, which could possibly be used to manually install the C libraries and modules needed to run the checks.
To perform the spell check, one option could be to use the pyenchant lib directly on the diffs obtained from GitHub for files in the docs directory, using the suggested spellings to generate terminal output for correcting spelling errors in the docs. Also, either the different repos that use this spell check task will need to start maintaining a dictionary file, or that file will need to be added to the beefore repo, in order to add exceptions for words like BeeWare and Beefore, which aren't misspelled but aren't in the standard English dictionary.
To test this modification, the beekeeper yaml file will need to be updated to use the spellingbee.py task. It's not clear to me what exactly needs to be changed in the beekeeper yaml file, or which functions/class defs are and are not required to make the spell check task file, and it seems like a lot of that depends on how beekeeper runs beefore. I've been looking at the beekeeper repo to find this information, but the docs for it aren't very complete, so I've been stepping through the repo trying to find some of it.
At this point I'm starting to run up against the end of my vacation time, so I'm not going to be able to spend as much time on this ticket as I could the last couple of days. I did find a couple of spelling errors in the docs for this repo, so I'll submit a PR for that small fix (see #19). Does this description of the ticket sound reasonable? Also, given the number of files that need to be touched, and that this issue seems to require some understanding of how this repo interacts with the main CI, is this issue appropriate for first time contributors?
@zifn It sounds like you're on the right general track. Adding extra dependencies in install_requires is no problem; the rest of the task is to wrap the calls necessary to start a spell checker over the documentation directory.
Sphinx has some spelling check functionality (both built in and as plugins), covering most of the cases you've described (e.g., having a "local" dictionary of words known to be OK); I'd expect to see those native tools used for the heavy lifting (just as we use flake8 to check for code style errors). The bulk of the task is collating the spelling issues with the lines that are covered by the patch (so you don't report a spelling error with code/text that isn't actually in the patch).
There's definitely some detail that needs to be worked through, but I can't see any reason that this couldn't be tackled by a first time contributor.
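To make the "collating" step concrete, here is a minimal sketch of one way it could work. It assumes a git-style unified diff (with `diff --git` lines) and a hypothetical `SpellingIssue` record; the names `added_lines` and `issues_in_patch` are made up for illustration and are not part of Beefore.

```python
import re
from typing import NamedTuple

# Matches a hunk header such as "@@ -10,4 +12,6 @@" and captures the
# first line number of the hunk in the *new* version of the file.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


class SpellingIssue(NamedTuple):
    """Hypothetical record for one misspelling reported by a spell checker."""
    path: str
    line: int
    word: str


def added_lines(diff_text):
    """Map each file in a git-style unified diff to the line numbers it adds."""
    added = {}
    path = None
    new_line = 0
    in_hunk = False
    for raw in diff_text.splitlines():
        if raw.startswith("diff --git "):
            # A new file section starts; its headers come before any hunk.
            path, in_hunk = None, False
            continue
        if not in_hunk and raw.startswith("+++ "):
            target = raw[4:].strip()
            path = None if target == "/dev/null" else target.removeprefix("b/")
            continue
        match = HUNK_HEADER.match(raw)
        if match:
            new_line, in_hunk = int(match.group(1)), True
            continue
        if not in_hunk or path is None:
            continue
        if raw.startswith("+"):
            added.setdefault(path, set()).add(new_line)
            new_line += 1
        elif raw.startswith("-") or raw.startswith("\\"):
            continue  # removed lines and "\ No newline" markers don't advance
        else:
            new_line += 1  # context line
    return added


def issues_in_patch(issues, diff_text):
    """Keep only the issues that fall on lines the patch actually adds."""
    touched = added_lines(diff_text)
    return [i for i in issues if i.line in touched.get(i.path, set())]
```

The spell checker itself would still do the heavy lifting; this only filters its report so that pre-existing misspellings outside the patch aren't flagged on the PR.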
Hello @freakboy3742 ! If the issue still persists, I'd love to help out :D
@devarshigoswami Yes - the issue still exists! However, it's safe to say that the scope has changed a little; we've been moving our CI infrastructure to Github Actions, and reducing our use of BeeFore and BeeKeeper.
If you wanted to try a contribution here, I'd suggest trying to add a "Spell check" step to Briefcase's CI definition. Although briefcase is using Beefore to check the Python formatting, using Beefore is not required for the spell check.
Thanks for the prompt reply! :) Since I'm relatively a noob, could you give me a little more detail about the task? Regarding 'add a "Spell check" step to Briefcase's CI definition https://github.com/beeware/briefcase/blob/master/.github/workflows/ci.yml': does the solution need to be YAML-specific since it's a .yml file? Do I check it against a generic dictionary, or might there be errors in the code as well?
So the task, in the most high-level "use case" terms, is this:
As a project maintainer, I want to avoid ever merging a PR that contains a change to documentation with a spelling error, or an error in markup.
The ci.yml file I referenced is the configuration of Github actions. Those are the commands that get executed every time a contributor submits a pull request. You could run the same commands yourself by hand; the configuration file gives Github enough detail to run them automatically.
So - the task really has 2 parts:
- Work out how, from the command line, you can verify that there are no spelling errors or markup errors in the Sphinx documentation for the project
- Work out how to configure Github Actions to invoke those commands.
The markup errors task is relatively straightforward - if you try to build the documentation, and there's a markup error, you'll find the return code of the build is non-zero - that's the usual Unix response for "this command raised an error". If a Github Action returns a non-zero return code, that build step will fail.
The spell check task will be similar - but you'll need to work out how to run a spell check over a Sphinx documentation directory (including adding/excluding words that are spelled correctly, but aren't in the spell checker's dictionary).
I'm not especially concerned about spelling errors in code - with the possible exception of function docstrings, if the project's documentation includes them.
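As a rough illustration of the return-code idea, the sketch below runs a Sphinx HTML build and hands back its exit status. The `docs` and `docs/_build/html` paths and the helper names are assumptions for illustration; `-W` is Sphinx's option for turning warnings into errors, which may be needed if markup problems are only reported as warnings.

```python
import subprocess
import sys


def docs_build_command(source_dir="docs", output_dir="docs/_build/html"):
    """Build the argument list for a Sphinx HTML build (paths are assumed)."""
    # "python -m sphinx" is equivalent to calling sphinx-build directly.
    return [sys.executable, "-m", "sphinx", "-W", "-b", "html", source_dir, output_dir]


def run_step(command):
    """Run one CI-style step and return its exit code (0 means success)."""
    completed = subprocess.run(command)
    return completed.returncode
```

A CI step could then end with `sys.exit(run_step(docs_build_command()))`, so a failed build propagates a non-zero return code and the step fails.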
Hey there, Russell! I believe we can solve the spelling check problem with this: maybe install aspell via apt (addons: apt: packages: aspell) and then add a shell script to do the checking (script: spellcheck.sh)? Am I on the same page?
On Fri, 28 Feb 2020 at 15:59, Devarshi Goswami [email protected] wrote:
Thanks a bunch. I will try my best to work on this. :)
That might be one approach - however, I think you should possibly do a little more research into options that are already well integrated into Sphinx. A quick search revealed this one: https://github.com/sphinx-contrib/spelling; there may be others.
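For reference, a minimal sketch of what enabling sphinxcontrib-spelling might look like in a project's Sphinx `conf.py`. This is not taken from any BeeWare repo; the option names come from that extension's documentation, and the word list file name is just an example.

```python
# conf.py fragment (sketch): enable the sphinxcontrib-spelling extension.
extensions = ["sphinxcontrib.spelling"]

# Language of the dictionary to check against.
spelling_lang = "en_US"

# Project-specific words (e.g. BeeWare, Beefore), one per line, that are
# spelled correctly but missing from the standard dictionary.
spelling_word_list_filename = "spelling_wordlist.txt"
```

With that in place, the check would be run from the command line with the extension's `spelling` builder, e.g. `sphinx-build -b spelling docs docs/_build/spelling` (paths assumed).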
Thanks, I'll look into it ASAP.