Add a wordcheck API
Suggested implementation: Assuming that wordcheck is integrated with the proofreading page, a number of functions would be needed:
Basic wordcheck function
This would need to identify the project, so that the project's good and bad words could be used, but would not need to identify the page or user.
Send:
- a piece of text
- a set of languages
- a set of ad-hoc words (possibly empty).
Receive:
- An array of 'objects', each consisting of a piece of text and a code indicating one of:
  - normal text
  - WC_WORLD
  - WC_SITE
  - WC_PROJECT
  - WC_PAGE
  - punctuation
  - text in a 'wrong' language (the code gives the language)
- A list of uncommon scripts found
The client would use this information to construct the page shown to the user.
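To make the shape of the exchange concrete, here is a rough sketch of what the client side could look like. Nothing here is an existing API: `buildWordcheckRequest`, `renderSegments`, the field names (`text`, `languages`, `adHocWords`, `code`) and the `<mark data-code>` markup are all hypothetical, and the code value `"normal"` is an assumption for "normal text".

```javascript
// Hypothetical request body: a piece of text, a set of languages,
// and a (possibly empty) set of ad-hoc words.
function buildWordcheckRequest(text, languages, adHocWords = []) {
  return { text, languages: [...languages], adHocWords: [...adHocWords] };
}

// Escape text before placing it in HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build page markup from the returned array of { text, code } objects.
// Segments with the assumed code "normal" are emitted as plain text;
// everything else is wrapped so the client can style or edit it.
function renderSegments(segments) {
  return segments
    .map(({ text, code }) => {
      const safe = escapeHtml(text);
      return code === "normal"
        ? safe
        : `<mark data-code="${escapeHtml(code)}">${safe}</mark>`;
    })
    .join("");
}
```

The client would keep the `data-code` value around so it can decide which flagged words are suggestible and which are only highlighted.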
Finalise wordcheck function
This would identify the project, page and user and invoke the save_wordcheck_event() function. If the user had previously entered wordcheck mode it would be called when the user saves the page as done, or possibly also as-in-progress. Sends the set of suggested ad-hoc words (possibly empty).
Get the set of languages with dictionaries.
The way it works now, only WC_WORLD bad words are suggestible, so we don't need to distinguish the others for the purpose of the API.
Where the same suggestible word occurs more than once on a page the client needs to know this. Does this affect the function above or not?
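One possible answer, as a sketch only: if every occurrence of a flagged word comes back as its own entry in the returned array, the client can work out the repeats itself and the function above would not need to change. The `{ text, code }` segment shape and the `countSuggestible` name are assumptions for illustration.

```javascript
// Count how often each suggestible (WC_WORLD) word occurs on the page.
// Assumes each occurrence is returned as a separate { text, code } segment.
function countSuggestible(segments) {
  const counts = new Map();
  for (const { text, code } of segments) {
    if (code === "WC_WORLD") {
      counts.set(text, (counts.get(text) || 0) + 1);
    }
  }
  return counts;
}
```

Words whose count is greater than one are the ones the client needs to treat together, for example by accepting all occurrences at once.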
Regarding the 'finalise' or 'report' wordcheck function, a better plan could be to send the set of accepted words as part of the save and checkin functions. This would ensure an appropriate page state and user. The array could be empty. Not sending an array would indicate that wordcheck had not been run; alternatively, a separate variable could indicate whether wordcheck had been run.
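A minimal sketch of the "omit the array" variant, to show how the two cases stay distinguishable. The payload fields (`text`, `acceptedWords`) and both function names are hypothetical and not taken from the existing save or checkin code.

```javascript
// Build the body for a save/checkin request.
// wordcheck = { ran: boolean, acceptedWords: string[] } is an assumed
// client-side record of whether the user entered wordcheck mode.
function buildSavePayload(pageText, wordcheck) {
  const payload = { text: pageText };
  if (wordcheck.ran) {
    // An empty array still means "wordcheck was run, nothing accepted".
    payload.acceptedWords = [...wordcheck.acceptedWords];
  }
  return payload;
}

// How the receiving side could tell the cases apart.
function wordcheckWasRun(payload) {
  return Array.isArray(payload.acceptedWords);
}
```

The separate-variable alternative would instead always send the array plus an explicit boolean, which is more verbose but avoids giving meaning to a missing field.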
Part of this was implemented in https://github.com/DistributedProofreaders/dproofreaders/pull/1379
As @cpeel said, parts of "Basic wordcheck function" and "Finalise wordcheck function" were implemented in #1379, though in a different way from what was suggested here.
The "Get the set of languages with dictionaries" part is in merge request #1406.
Another part is to get words with uncommon or perhaps multiple scripts. PR to follow.
Finalise wordcheck function
This would identify the project, page and user and invoke the save_wordcheck_event() function. If the user had previously entered wordcheck mode it would be called when the user saves the page as done, or possibly also as-in-progress. Sends the set of suggested ad-hoc words (possibly empty).
This approach does not allow for the case where wordcheck was run and the user leaves the page without saving it (so it is still in the "out" state). It should instead be called when the user leaves the page.
@70ray - are there other WordCheck APIs we need?
I don't know if we need anything for words with uncommon scripts. JavaScript can't easily do what the PHP utf8_char_script() function does, but if you already know which scripts you are dealing with, you can find the words that contain them. Perhaps we could have a way to find which scripts are possible from the Charsuites, including any special characters for the particular project; JavaScript could then do whatever we think appropriate.
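As a rough illustration of the "if you know the scripts already" case: JavaScript regular expressions support Unicode script property escapes such as `\p{Script=Greek}` when the `u` flag is set, so given a list of script names the client can pick out the words that contain them. The function name `wordsWithScripts` is made up, and how the script names would be derived from the Charsuites is left open; the names must be valid Unicode script names or the `RegExp` constructor throws.

```javascript
// Find words containing characters from any of the given scripts.
// scripts: array of Unicode script names, e.g. ["Greek", "Cyrillic"]
// (assumed to be supplied by the server for the project).
function wordsWithScripts(text, scripts) {
  // One non-global regex per script; non-global avoids lastIndex state.
  const testers = scripts.map((name) => [
    name,
    new RegExp(`\\p{Script=${name}}`, "u"),
  ]);
  // Treat a run of letters and combining marks as a word.
  const words = text.match(/[\p{L}\p{M}]+/gu) || [];
  const result = [];
  for (const word of words) {
    const found = testers
      .filter(([, re]) => re.test(word))
      .map(([name]) => name);
    if (found.length > 0) {
      result.push({ word, scripts: found });
    }
  }
  return result;
}
```

This does not replace a general "which script is this character" lookup; it only answers whether a word contains one of the scripts it was told to look for.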