Add a wordcheck API

Open 70ray opened this issue 1 year ago • 8 comments

Suggested implementation: Assuming that wordcheck is integrated with the proofreading page, a number of functions would be needed:

Basic wordcheck function

This would need to identify the project, so that the project's good and bad word lists could be used, but it would not need to identify the page or user. Send:

  • a piece of text
  • a set of languages
  • a set of ad-hoc words (possibly empty).

Receive:

  • An array of 'objects', each consisting of a piece of text and a code indicating one of:
    • normal text
    • WC_WORLD
    • WC_SITE
    • WC_PROJECT
    • WC_PAGE
    • punctuation
    • text in 'wrong' language (code gives language)
  • A list of uncommon scripts found

The client would use this information to construct the page to show to the user.
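As a rough illustration of the exchange described above, here is a minimal client-side sketch. The field names (`segments`, `uncommonScripts`, `text`, `code`), the `"normal"` and `"punctuation"` code strings, and the helper names are assumptions made for this example; only the `WC_*` codes come from the proposal.

```javascript
// Hypothetical response shape: an ordered list of text segments, each with a code.
const sampleResponse = {
  segments: [
    { text: "The ", code: "normal" },
    { text: "qick", code: "WC_WORLD" },
    { text: " brown <fox>", code: "normal" },
    { text: ".", code: "punctuation" },
  ],
  uncommonScripts: [],
};

// Escape characters that are significant in HTML so page text is shown literally.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the markup for the page: normal text is emitted as-is (escaped),
// every other segment is wrapped in a span whose class is its code.
function renderSegments(segments) {
  return segments
    .map(({ text, code }) => {
      const safe = escapeHtml(text);
      return code === "normal"
        ? safe
        : `<span class="${escapeHtml(code)}">${safe}</span>`;
    })
    .join("");
}

const html = renderSegments(sampleResponse.segments);
```

Keeping the segments in document order means the client can rebuild the page by simple concatenation, without needing character offsets.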

Finalise wordcheck function

This would identify the project, page and user and invoke the save_wordcheck_event() function. If the user had previously entered wordcheck mode, it would be called when the user saves the page as done, or possibly also as-in-progress. It would send the set of suggested ad-hoc words (possibly empty).

Get the set of languages with dictionaries.

70ray avatar Sep 30 '24 16:09 70ray

The way it works now, only WC_WORLD bad words are suggestible, so we don't need to distinguish the others for the purpose of the API.

70ray avatar Oct 01 '24 18:10 70ray

Where the same suggestible word occurs more than once on a page, the client needs to know this. Does this affect the function above or not?
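One possible answer, sketched under the assumption that the response lists every occurrence as its own segment: the client could count repeats itself, leaving the function unchanged. The segment shape (`text`, `code`) and the function name are hypothetical.

```javascript
// Count how often each suggestible word appears in the returned segments.
// Assumes each occurrence is a separate segment and that only WC_WORLD
// words are suggestible, as noted in the previous comment.
function countSuggestible(segments, suggestibleCode = "WC_WORLD") {
  const counts = new Map();
  for (const { text, code } of segments) {
    if (code === suggestibleCode) {
      counts.set(text, (counts.get(text) ?? 0) + 1);
    }
  }
  return counts;
}

const counts = countSuggestible([
  { text: "teh", code: "WC_WORLD" },
  { text: " cat and ", code: "normal" },
  { text: "teh", code: "WC_WORLD" },
  { text: " ", code: "normal" },
  { text: "dgo", code: "WC_WORLD" },
]);
```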

70ray avatar Oct 01 '24 18:10 70ray

Regarding the 'finalise' or 'report' wordcheck function, a better plan could be to send the set of accepted words as part of the save and checkin functions. This would ensure an appropriate page state and user. The array could be empty. Not sending any array would indicate that wordcheck had not been run; alternatively, a separate variable could indicate whether wordcheck had been run.
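A minimal sketch of the "absent array means wordcheck was not run" idea follows. The payload field names (`text`, `accepted_words`) and both function names are assumptions for illustration, not the actual save or checkin API.

```javascript
// Build the body for a save/checkin request.
// acceptedWords === undefined  -> wordcheck was not run, field is omitted
// acceptedWords === []         -> wordcheck was run, nothing was accepted
function buildSavePayload(pageText, acceptedWords) {
  const payload = { text: pageText };
  if (acceptedWords !== undefined) {
    payload.accepted_words = [...acceptedWords];
  }
  return payload;
}

// How a receiver could tell the two cases apart.
function wordcheckWasRun(payload) {
  return Array.isArray(payload.accepted_words);
}

const notRun = buildSavePayload("page text");
const runNothingAccepted = buildSavePayload("page text", []);
const runWithWords = buildSavePayload("page text", ["colour"]);
```

The alternative mentioned above, an explicit flag, would make the intent more visible in the request at the cost of one more field to keep consistent with the array.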

70ray avatar Oct 03 '24 08:10 70ray

Part of this was implemented in https://github.com/DistributedProofreaders/dproofreaders/pull/1379

cpeel avatar Jan 19 '25 03:01 cpeel

As @cpeel said, parts of "Basic wordcheck function" and "Finalise wordcheck function" were implemented in #1379, though in a different way from what was suggested here.

The "Get the set of languages with dictionaries" part is in merge request #1406.

Another part is to get words with uncommon or perhaps multiple scripts. PR to follow.

70ray avatar Jan 19 '25 09:01 70ray

> Finalise wordcheck function
>
> This would identify the project, page and user and invoke the save_wordcheck_event() function. If the user had previously entered wordcheck mode it would be called when the user saves the page as done, or possibly also as-in-progress. Sends the set of suggested ad-hoc words (possibly empty).

This usage does not allow for the case where wordcheck was run and the user leaves the page without saving it (so it is still in the "out" state). It should instead be called when the user leaves the page.

70ray avatar Jan 19 '25 09:01 70ray

@70ray - are there other WordCheck APIs we need?

cpeel avatar Mar 25 '25 12:03 cpeel

> @70ray - are there other WordCheck APIs we need?

I don't know if we need anything for words with uncommon scripts. JavaScript can't easily do what the PHP function utf8_char_script() does, but if you already know which scripts you are dealing with, you can find the words that contain them. Perhaps we could have a way to find which scripts are possible from the Charsuites, including any special characters for the particular project; JavaScript could then do whatever we think appropriate.
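To illustrate the "if you know the scripts already" case, here is a sketch using JavaScript's Unicode property escapes (`\p{Script=...}` with the `u` flag). The function name and the idea of passing the script list in as an argument are assumptions; where that list would come from (for example, the Charsuites) is left open.

```javascript
// Find words containing at least one character from any of the given scripts.
// `scripts` must hold Unicode script names such as "Greek" or "Cyrillic";
// an unknown name makes the RegExp constructor throw a SyntaxError.
function findWordsWithScripts(text, scripts) {
  // Treat runs of letters and combining marks as words.
  const words = text.match(/[\p{L}\p{M}]+/gu) ?? [];
  const testers = scripts.map((name) => ({
    name,
    re: new RegExp(`\\p{Script=${name}}`, "u"),
  }));
  const found = [];
  for (const word of words) {
    const hits = testers
      .filter(({ re }) => re.test(word))
      .map(({ name }) => name);
    if (hits.length > 0) {
      found.push({ word, scripts: hits });
    }
  }
  return found;
}

const flagged = findWordsWithScripts("abc αβγ где", ["Greek", "Cyrillic"]);
```

This only checks for scripts it is told about, so unlike utf8_char_script() it cannot report which script an arbitrary character belongs to.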

70ray avatar Mar 25 '25 16:03 70ray