Add a task to spell check documentation

Open freakboy3742 opened this issue 7 years ago • 27 comments

We need a Beefore task that will do a spell check of any changes to the documentation directory.

freakboy3742 avatar Jul 24 '17 03:07 freakboy3742

Can you elaborate, please, @freakboy3742?

Logan1x avatar Jul 24 '17 04:07 Logan1x

I'm not sure what more detail you're looking for. Beefore currently has a task for doing things like linting, so it can find (and comment on) code format problems. There are tools available for spell checking ReST documentation; we should create a Beefore task to perform a spell check on any PR that touches the /docs directory.

freakboy3742 avatar Jul 25 '17 00:07 freakboy3742

What technology do I need to know before I can work on this issue, @freakboy3742?

Logan1x avatar Jul 26 '17 05:07 Logan1x

I used Peter Norvig's spell checking algorithm to implement the spell checker. However, the spell checker will only work well if there is a dictionary of the words used in the document. I can submit it if you want.

garretvo19 avatar Aug 05 '17 15:08 garretvo19

Yes you can

Logan1x avatar Aug 06 '17 07:08 Logan1x

Hello @freakboy3742, can I do this task? I am new to contributing to open source and I want to work on this.

ujjaldas1997 avatar Aug 12 '17 18:08 ujjaldas1997

Hey @freakboy3742, I am a newbie in the open source world. If this issue is still open, can I work on it?

MeghaSharma21 avatar Aug 16 '17 09:08 MeghaSharma21

Hey All - I'm looking at PyEnchant for this. Since nothing has been submitted yet, I'm researching how to put this together. Feel free to beat me to it.

thomasoflight avatar Aug 18 '17 03:08 thomasoflight

can i do this , Mr.?

ivanzaqqa avatar Sep 13 '17 03:09 ivanzaqqa

@ivanzaqqa I believe @thomasoflight is working on this - but if he doesn't respond in the next few days, feel free to have a go!

freakboy3742 avatar Sep 13 '17 15:09 freakboy3742

Hi there @ivanzaqqa. I am currently developing this feature, albeit at a first-timer pace. If there's another issue that you've seen, perhaps consider that one first. Cheers!

thomasoflight avatar Sep 13 '17 20:09 thomasoflight

Okey thanks

ivanzaqqa avatar Sep 13 '17 23:09 ivanzaqqa

Hello, I was thinking of something along the lines of Peter Norvig's amazing tutorial here: http://norvig.com/spell-correct.html

cyberdrk avatar Oct 24 '17 03:10 cyberdrk

@cyberdrk We don't need a solution built from scratch. There are existing libraries for Sphinx that do spell checking. What we need is a plugin that integrates one of those libraries.

freakboy3742 avatar Oct 24 '17 03:10 freakboy3742

Hey @cyberdrk, @freakboy3742 - A while back I posted this gist which documents my initial efforts to integrate Sphinx with Beefore. @freakboy3742 do you know of any resources that can outline this process? The Sphinx documentation contains a lot of information which is challenging. @cyberdrk I'd love to work with you on this or see how you're solving it. I'm currently stepping through the Beefore codebase to see how the pieces fit together as I write a rough draft.

thomasoflight avatar Oct 24 '17 13:10 thomasoflight

Hi, I'm a first time contributor but if there aren't any objections, I would like to take a crack at this issue.

zifn avatar May 15 '18 16:05 zifn

I noticed that the sphinx spell checker is using a library (pyenchant) that is no longer being maintained and I didn't see another library that is as mature as that one for spell checking. How should I proceed?

zifn avatar May 15 '18 21:05 zifn

@freakboy3742 Hi, I just want to confirm with you what's involved in addressing this issue. To make a new Beefore task, does this involve writing a new module (pyspellingbee?) that uses the spell checker libraries on the diffs of .rst files or other files in the docs folder?

zifn avatar May 16 '18 14:05 zifn

@freakboy3742 @cyberdrk @thomasoflight @ivanzaqqa So here's what I'm thinking thus far. It seems that to add a beekeeper task for spell check, a new file (possibly named spellingbee.py) will need to be added to the beefore\checks directory. The required modules (pyenchant, and maybe sphinxcontrib-spelling along with sphinx, depending on the spell check implementation) will also need to be installed, as well as the enchant C library, which pyenchant is a wrapper for (also note that as of May 2018 pyenchant is no longer supported and is looking for a new supporting developer). To install these libraries, the install_requires parameter in setup.py may need to be modified to include the appropriate modules. Another solution could be to use the (potentially required?) function def prepare(directory) for modules in the beefore\checks directory, which could possibly be used to manually install the C libraries and modules needed to run the checks.

To perform the spell check, one option could be to use the pyenchant library directly on the diffs obtained from GitHub for files in the docs directory, using the suggested spellings to generate terminal output that can be used to correct spelling errors in the docs. Also, either the different repos that use this spell check task will need to start maintaining a dictionary file, or that file will need to be added to the beefore repo, in order to add exceptions for words like BeeWare and Beefore, which aren't misspelled but aren't in the standard English dictionary.

To test this modification, the beekeeper yaml file will need to be updated to use the spellingbee.py task. It's not clear to me exactly what needs to be changed in the beekeeper yaml file, or which functions/class defs are and are not required to make the spell check task file; it seems that a lot of that depends on how beekeeper runs beefore. I've been looking at the beekeeper repo to find this information, but the docs for it aren't very complete, so I've been stepping through the beekeeper repo trying to find some of it.

At this point I'm starting to run up against the end of my vacation time, so I'm not going to be able to spend as much time on this ticket as I could the last couple of days. I did find a couple of spelling errors in the docs for this repo, so I'll submit a PR for that small fix (see #19). Does this description of the ticket sound reasonable? Also, given the number of files that need to be touched, and that this issue seems to require some understanding of how this repo interacts with the main CI, is this issue appropriate for first time contributors?
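To make the dictionary-of-exceptions idea concrete, here is a minimal sketch using only the Python standard library. It is a stand-in for a real checker such as pyenchant, not its API: the function name misspelled, the tiny dictionary, and the sample text are all assumptions made up for illustration.

```python
import re

# A "word" here is a run of letters, optionally with an apostrophe suffix.
WORD = re.compile(r"[A-Za-z]+(?:'[a-z]+)?")


def misspelled(text, dictionary, exceptions):
    """Return (line number, word) pairs for words in neither word set."""
    known = {w.lower() for w in dictionary} | {w.lower() for w in exceptions}
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in WORD.findall(line):
            if word.lower() not in known:
                found.append((lineno, word))
    return found


# Hypothetical data: a tiny base dictionary plus project-specific exceptions.
dictionary = {"the", "task", "runs", "before", "merge"}
exceptions = {"BeeWare", "Beefore"}
sample = "The Beefore task runs\nbefore the mrege"

issues = misspelled(sample, dictionary, exceptions)
print(issues)
```

Keeping the exceptions in a separate set mirrors the question raised above of whether each repo maintains its own word list or a shared one lives in the beefore repo.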

zifn avatar May 18 '18 06:05 zifn

@zifn It sounds like you're on the right general track. Adding extra dependencies in install_requires is no problem; the rest of the task is to wrap the calls necessary to start a spell checker over the documentation directory.

Sphinx has some spelling check functionality (both built in and as plugins), covering most of the cases you've described (e.g., having a "local" dictionary of words known to be OK); I'd expect to see those native tools used for the heavy lifting (just as we use flake8 to check for code style errors). The bulk of the task is collating the spelling issues with the lines that are covered by the patch (so you don't report a spelling error with code/text that isn't actually in the patch).
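The collation step described above might be sketched roughly as follows. This is an illustration under assumptions, not Beefore's actual implementation: it assumes the patch is available as unified diff text and that the spell checker reports issues as (path, line, word) tuples; the names added_lines and issues_in_patch are hypothetical.

```python
import re

# Matches a unified diff hunk header such as "@@ -1,3 +1,4 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_lines(diff_text):
    """Map each file path to the set of line numbers the patch adds."""
    added = {}
    path = None
    old_left = new_left = 0  # lines still expected in the current hunk
    new_lineno = 0
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                added[path].add(new_lineno)
                new_lineno += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif not line.startswith("\\"):
                # Context line: present in both old and new file.
                new_lineno += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            target = line[4:].strip()
            path = target[2:] if target.startswith("b/") else target
            added.setdefault(path, set())
            continue
        match = HUNK_HEADER.match(line)
        if match and path is not None:
            old_left = int(match.group(1) or 1)
            new_lineno = int(match.group(2))
            new_left = int(match.group(3) or 1)
    return added


def issues_in_patch(issues, diff_text):
    """Keep only (path, line, word) issues on lines the patch added."""
    touched = added_lines(diff_text)
    return [i for i in issues if i[1] in touched.get(i[0], set())]


# Hypothetical patch and checker output, for illustration only.
DIFF = "\n".join([
    "diff --git a/docs/index.rst b/docs/index.rst",
    "--- a/docs/index.rst",
    "+++ b/docs/index.rst",
    "@@ -1,3 +1,4 @@",
    " Welcome",
    " =======",
    "+This is a speling mistake.",
    " Old text with an eror.",
])
ISSUES = [("docs/index.rst", 3, "speling"), ("docs/index.rst", 4, "eror")]

reported = issues_in_patch(ISSUES, DIFF)
print(reported)
```

Only the misspelling on the added line is reported; the pre-existing one on line 4 is left alone because the patch did not touch it.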

There's definitely some detail that needs to be worked through, but I can't see any reason that this couldn't be tackled by a first time contributor.

freakboy3742 avatar May 18 '18 13:05 freakboy3742

Hello @freakboy3742 ! If the issue still persists, I'd love to help out :D

devarshigoswami avatar Feb 27 '20 10:02 devarshigoswami

@devarshigoswami Yes - the issue still exists! However, it's safe to say that the scope has changed a little; we've been moving our CI infrastructure to Github Actions, and reducing our use of BeeFore and BeeKeeper.

If you wanted to try a contribution here, I'd suggest trying to add a "Spell check" step to Briefcase's CI definition. Although briefcase is using Beefore to check the Python formatting, using Beefore is not required for the spell check.

freakboy3742 avatar Feb 27 '20 23:02 freakboy3742

Thanks for the prompt reply! :) Since I'm relatively a noob, could you give me a few more specifics about the task? "Add a "Spell check" step to Briefcase's CI definition https://github.com/beeware/briefcase/blob/master/.github/workflows/ci.yml": does the solution need to be YAML-specific since it's a .yml file? Do I check it against a generic dictionary, or might there be errors in the code as well?

devarshigoswami avatar Feb 28 '20 07:02 devarshigoswami

So the task, in the most high level, "use case" terms is this:

As a project maintainer, I want to avoid ever merging a PR that contains a change to documentation with a spelling error, or an error in markup.

The ci.yml file I referenced is the configuration of Github actions. Those are the commands that get executed every time a contributor submits a pull request. You could run the same commands yourself by hand; the configuration file gives Github enough detail to run them automatically.

So - the task really has 2 parts:

  1. Work out how, from the command line, you can verify that there are no spelling errors or markup errors in the Sphinx documentation for the project

  2. Work out how to configure Github Actions to invoke those commands.

The markup errors task is relatively straightforward - if you try to build the documentation, and there's a markup error, you'll find the return code of the build is non-zero - that's the usual Unix response for "this command raised an error". If a Github Action returns a non-zero return code, that build step will fail.
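As a rough illustration of the return-code behaviour described above, here is a minimal Python sketch. The helper name run_step and the docs paths are assumptions for illustration. The -W option of sphinx-build turns warnings into errors, which matters here because Sphinx reports many markup problems as warnings rather than failing outright.

```python
import subprocess
import sys


def run_step(command):
    """Run one command and return its exit code; 0 means success."""
    return subprocess.run(command).returncode


# Hypothetical docs build; the source and output paths are assumptions.
DOCS_BUILD = ["sphinx-build", "-W", "-b", "html", "docs", "docs/_build/html"]

# Stand-in commands so the sketch runs without Sphinx installed.
ok = run_step([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_step([sys.executable, "-c", "raise SystemExit(2)"])
print(ok, failed)
```

In a CI step you would pass DOCS_BUILD to run_step (or simply run the command directly) and let a non-zero result fail the step.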

The spell check task will be similar - but you'll need to work out how to run a spell check over a Sphinx documentation directory (including adding/excluding words that are spelled correctly, but aren't in the spell checker's dictionary).
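One possible starting point, assuming the sphinxcontrib-spelling extension is installed, is a fragment like the following in the project's Sphinx conf.py. The file location and the word list filename are assumptions; the option names are the ones that extension documents, but it is worth checking them against the version you install.

```python
# docs/conf.py (fragment) -- assumes `pip install sphinxcontrib-spelling`.

# Enable the spelling builder provided by the extension.
extensions = [
    "sphinxcontrib.spelling",
]

# Dictionary language to check against.
spelling_lang = "en_US"

# Project word list, one accepted word per line (e.g. BeeWare, Beefore).
# The filename is an assumption; it is resolved relative to this conf.py.
spelling_word_list_filename = "spelling_wordlist.txt"
```

With that in place, running `sphinx-build -b spelling docs docs/_build/spelling` should write the misspellings it finds to the output directory. Whether the command also exits non-zero on a misspelling appears to depend on the extension version, so verify that before relying on it to fail a CI step.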

I'm not especially concerned about spelling errors in code - with the possible exception of function docstrings, if the project's documentation includes them.

freakboy3742 avatar Feb 28 '20 08:02 freakboy3742

Hey there, Russell! I believe we can solve the spelling check problem with this. Maybe add aspell via apt:

    addons:
      apt:
        packages: aspell

and then add a shell script to do the checking?

    script: spellcheck.sh

Am I on the same page?

On Fri, 28 Feb 2020 at 15:59, Devarshi Goswami [email protected] wrote:

Thanks a bunch. I will try my best to work on this. :)

devarshigoswami avatar Feb 29 '20 05:02 devarshigoswami

That might be one approach - however, I think you should possibly do a little more research into options that are already well integrated into Sphinx. A quick search revealed this one (https://github.com/sphinx-contrib/spelling); there may be others.

freakboy3742 avatar Feb 29 '20 07:02 freakboy3742

Thanks, I'll look into it ASAP. On Feb 29, 2020 13:05, "Russell Keith-Magee" [email protected] wrote:

That might be one approach - however, I think you should possibly do a little more research into options that are already well integrated into Sphinx. A quick search revealed this one https://github.com/sphinx-contrib/spelling; there may be others.


devarshigoswami avatar Feb 29 '20 10:02 devarshigoswami