lexical Feature: No way to add Grammarly-like contextual spell-check

My use case is as follows:

Whenever the user updates the text of the editor in any way, a request is made with the new text content of the editor to our custom spellcheck API. The request is debounced: it fires only once the user has stopped typing, not on every character, to avoid unnecessary network requests.
The API response returns a list of ranges, for example [[0, 5], [5, 9], [9, 15]], and I also know whether each range is "good text" or "bad text". Let's assume the [5, 9] range is a spelling error and the rest are normal text ranges.
I want to split the editor content into three elements: two normal TextNodes and a custom BadSpellingNode (extending TextNode) that shows a contextual menu when hovered.
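
To make the range handling concrete, here is a minimal sketch of how the flat text could be cut into good/bad segments before any nodes are created. The names segmentRanges and Segment are hypothetical (they are not part of Lexical or of my API), and the sketch assumes ranges are [start, end) character offsets into the full text.

```typescript
// Hypothetical helper: turns a list of "bad" ranges into contiguous
// good/bad segments that cover the whole text. Ranges are [start, end).
type CharRange = [number, number];

interface Segment {
  start: number;
  end: number; // exclusive
  bad: boolean;
}

function segmentRanges(textLength: number, badRanges: CharRange[]): Segment[] {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...badRanges].sort((a, b) => a[0] - b[0]);
  const segments: Segment[] = [];
  let cursor = 0;

  for (const [rawStart, rawEnd] of sorted) {
    // Clamp to the text and to what has already been covered.
    const start = Math.max(rawStart, cursor);
    const end = Math.min(rawEnd, textLength);
    if (start >= end) continue; // empty, out of bounds, or fully overlapped

    if (start > cursor) {
      segments.push({ start: cursor, end: start, bad: false });
    }
    segments.push({ start, end, bad: true });
    cursor = end;
  }

  // Trailing good text after the last bad range.
  if (cursor < textLength) {
    segments.push({ start: cursor, end: textLength, bad: false });
  }
  return segments;
}
```

With a text of length 15 and a single bad range [5, 9], this yields the three segments from the example above: good [0, 5], bad [5, 9], good [9, 15]. Each segment would then map to one TextNode or BadSpellingNode.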

I have tried many solutions to accomplish this, including every variation of the listener approaches described in the documentation. I feel this should be much simpler: I was able to accomplish it easily with Slate, but I abandoned that library once I found that it wasn't typed properly and had a lot of churn resulting from the Plate extension project.

Every approach I try results in an infinite loop: my listener triggers a side effect that updates the text, which then triggers the same listener again. In addition, computing where the caret/selection needs to be on each re-render of the editor (since it is lost and there are now multiple nodes) becomes extremely convoluted.
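
One way to picture the loop and a possible guard, as a sketch only: the listener does its work only when the plain text differs from the text it last saw, so an update that merely re-wraps nodes (same text) does not re-trigger it. createTextGate and countListenerRuns are hypothetical names, and the "editor" here is simulated with a plain function rather than Lexical itself. As far as I understand the docs, registerTextContentListener applies a similar rule by only firing when the text content changes.

```typescript
// Hypothetical guard: returns true only when the text differs from the
// text seen on the previous call.
function createTextGate(): (text: string) => boolean {
  let last: string | null = null;
  return (text: string): boolean => {
    if (text === last) return false;
    last = text;
    return true;
  };
}

// Simulated editor: every update notifies the listener, and the listener's
// own decoration step causes another update with the same text.
function countListenerRuns(initialText: string): number {
  const shouldCheck = createTextGate();
  let runs = 0;

  const onUpdate = (text: string): void => {
    runs++;
    // Decoration-only update: the text is unchanged, so stop here.
    if (!shouldCheck(text)) return;
    // Stand-in for "call the spellcheck API and re-wrap nodes", which
    // changes the node structure but not the text, and notifies again.
    onUpdate(text);
  };

  onUpdate(initialText);
  return runs;
}
```

Without the gate the simulated listener would recurse forever; with it, the listener runs twice (the real edit plus one decoration-only update) and stops.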

Am I doing something wrong?

This was previously quite easy in Slate. The code isn't super clean, but this was my general approach.

const Leaf = ({ attributes, children, leaf }: RenderLeafProps) => {

  if (leaf.bold) {
    children = <strong>{children}</strong>;
  }

  if (leaf.italic) {
    children = <em>{children}</em>;
  }

  if (leaf.underline) {
    children = <u>{children}</u>;
  }

  const ref = useRef(null);

  const [open, setOpen] = useState(false);
  const [anchorEl, setAnchorEl] = useState<HTMLSpanElement | null>(null);

  const handleMouseEnter = (event: React.MouseEvent<HTMLSpanElement>) => {
    setAnchorEl(event.currentTarget);
    setOpen(true);
  };

  const handleMouseLeave = (event: React.MouseEvent<HTMLSpanElement>) => {
    setAnchorEl(null);
    setOpen(false);
  };

  const strikeColor = 'red';

  const editor = useSlate();

  // Called when a user clicks to accept a spelling suggestion
  const handleDelete = useCallback(
    (event: React.MouseEvent<HTMLButtonElement>, from: number, to: number, replacementWord: string) => {
      event.preventDefault();
      // Select the data to replace
      Transforms.select(editor, {
        anchor: { path: [0, 0], offset: from },
        focus: { path: [0, 0], offset: to }
      });
      // Create a DataTransfer object
      const dataTransfer = new DataTransfer();
      dataTransfer.setData('text/plain', replacementWord);
      // Replace the data
      ReactEditor.insertTextData(editor, dataTransfer);
    },
    [editor]
  );

  const card = (
    <SpellingCorrectionPopperContainer>
      <CardContent>
        <Stack direction={'row'} spacing={2} alignItems={'center'}>
          <s>
            <SpellingCorrectionOriginalTypography>
              {originalWord}
            </SpellingCorrectionOriginalTypography>
          </s>
          <Box>
            <SpellingCorrectionForwardIcon />
          </Box>
          <SpellingCorrectionButton
            onClick={(event) =>
              handleDelete(
                event,
                leaf.wordRange[0],
                leaf.wordRange[1],
                leaf.replacementWord
              )
            }
          >
            {replacementWord}
          </SpellingCorrectionButton>
        </Stack>

        <Typography marginTop={2}>
          {message}
        </Typography>
      </CardContent>
      <CardActions>
        {/* <Button size="small">Learn More</Button> */}
      </CardActions>
    </SpellingCorrectionPopperContainer>
  );

  return (
    <span
      {...attributes}
      onMouseEnter={
        leaf.badSpelling ? (event) => handleMouseEnter(event) : undefined
      }
      onMouseLeave={
        leaf.badSpelling ? (event) => handleMouseLeave(event) : undefined
      }
      style={{
        textDecoration: leaf.badSpelling ? 'red wavy underline' : undefined
      }}
    >
      {children}

      <Popper open={open} anchorEl={anchorEl} transition>
        {({ TransitionProps }) => (
          <Fade {...TransitionProps} timeout={350}>
            {card}
          </Fade>
        )}
      </Popper>
    </span>
  );
};

Mar 12 '23 00:03 SikandAlex

Here is my approach in lexical:

export function SpellcheckPlugin() {
    const [editor] = useLexicalComposerContext()

    const [languageToolOutput, setLanguageToolOutput] = useState(null)
    const [editorText, setEditorText] = useState('')

    useEffect(() => {
      // editor.setEditable(false)
      editor.update(() => {
        console.log('\n\n\n\n\n\n\n\n')
        const root = $getRoot()
        const allChildren = root.getChildren()
        const allChildrenKeys = root.getChildrenKeys()
        const rootParagraphNode = root.getFirstChild()
        const allTextContent = rootParagraphNode?.getTextContent()
        
        // Get the caret position and original anchor
        const selection = $getSelection() as RangeSelection
        console.log(selection)

    
        
        const result = []

        // @ts-ignore
        if (allTextContent?.length && languageToolOutput?.matches.length && selection) {
          const originalAnchorListIndex = allChildrenKeys.indexOf(selection.anchor.key)
          const originalAnchorRangeIndex = selection.anchor.offset
          
          // @ts-ignore
          const matchData = languageToolOutput.matches.map(m => [m.offset, m.offset + m.length])
          const res = getConnectedRanges(allTextContent.length, matchData)
          for (const range of res) {
            const rangeText = allTextContent.substring(range[0], range[1])
            if (isRangeInArray(range, matchData)) {
              const textNode = $createTextNode(rangeText)
              result.push(textNode)
            } else {
              const textNode = $createTextNode(rangeText)
              result.push(textNode)
            }
          }

   
          const newParagraphNode = $createParagraphNode()
          for (const node of result) {
            newParagraphNode.append(node)
          }
          rootParagraphNode?.replace(newParagraphNode)
          const rangeSelection = $createRangeSelection()
          rangeSelection.anchor.key = root.getFirstChildOrThrow().getKey()
          rangeSelection.anchor.offset = 0
          rangeSelection.focus.key = root.getFirstChildOrThrow().getKey()
          rangeSelection.focus.offset = 0
          // const originalIndex = findOriginalIndex(originalAnchorListIndex, originalAnchorRangeIndex, res)
          $setSelection(rangeSelection)



        }
      })
    }, [languageToolOutput])

    const getLanguageToolOutput = (text: string) => {
      languageToolApiClient.check.checkCreate({
        text: text,
        language: 'en-US'
      }).then(res => {
        // @ts-ignore
        // setLanguageToolOutput(res.data)
      })
    }
    

    useEffect(() => {

      // Listen for changes to overall text content in order to refetch LanguageTool output 
      // const removeTextContentListener = editor.registerTextContentListener(
      //   (textContent: string) => {
      //     console.log('Text content listener ran')
      //     getLanguageToolOutput(textContent)
      //   }
      // )

      // Two possibilities: they edit a TextNode or my custom node (which extends TextNode)
      const removeMutationListener = editor.registerMutationListener(
        ParagraphNode,
        (mutatedNodes) => {
          console.log('ParagraphNode listener')
          const editorState = editor.getEditorState()
          editorState.read(() => {
            const root = $getRoot()
            const rootParagraph = root.getFirstChildOrThrow()
            // console.log(rootParagraph.getTextContent())
          })
        })

      // Two possibilities: they edit a TextNode or my custom node (which extends TextNode)
      const removeMutationListenerTwo = editor.registerMutationListener(
        TextNode,
        (mutatedNodes) => {

          console.log(mutatedNodes)
          console.log(Array.from(mutatedNodes.keys())[0])

          console.log('TextNode listener')
          const editorState = editor.getEditorState()
          editorState.read(() => {
            const root = $getRoot()
            const rootParagraph = root.getFirstChildOrThrow()
            const allText = rootParagraph.getTextContent()
            getLanguageToolOutput(allText)
          })
        })

      return () => {
        // removeTextContentListener();
        removeMutationListener();
        removeMutationListenerTwo();
      }

    }, [])

    return null

}

Mar 12 '23 00:03 SikandAlex

If anyone can think of an appropriate NodeTransform or MutationListener approach that avoids the difficulty of recomputing the caret location, I would greatly appreciate it and will buy you a coffee. Thanks for the open source project.

Sorry the code is messy; I'll try to clean it up over this weekend.

Mar 12 '23 01:03 SikandAlex

I'm beginning to wonder whether this will only work if I use the registerTextContentListener (because that is technically exactly what I want to listen for) and ensure that the new text content in the container is the exact same as previously. Then, I can register a NodeTransform on the ParagraphNode that contains both my custom node types and have it update without triggering an infinite loop.

Mar 12 '23 02:03 SikandAlex

Cleaned up example attempt at Transforms:


import { useLexicalComposerContext } from '@lexical/react/LexicalComposerContext';
import { LexicalEditor, LexicalNode, ParagraphNode, TextNode, $getRoot, $createParagraphNode, $createTextNode } from 'lexical';
import { useEffect } from 'react';
import { $createCustomNode } from './CustomNode';
import languageToolApiClient from '../../../Managers/LanguageToolApiClient';
import { getConnectedRanges } from './Utils';

export default function CustomNodePlugin() {


const getLanguageToolOutput = (text: string) => {
    return languageToolApiClient.check.checkCreate({
      text: text,
      language: 'en-US'
    })
  }


function customNodeTransform(node: LexicalNode) {

    console.log('ParagraphNode transform executed')

    // Node will be ParagraphNode
    const textContent = node.getTextContent();

    // Update the entire paragraph node 
    editor.update(() => {
        const newParagraphNode = $createParagraphNode()
        const newTextNode = $createTextNode(textContent)
        newParagraphNode.append(newTextNode)
        node.replace(newParagraphNode)
    })

    //
}

  function useCustomNodes(editor: LexicalEditor) {
    useEffect(() => {
      const removeTransform = editor.registerNodeTransform(
        ParagraphNode,
        customNodeTransform,
      );
      return () => {
        removeTransform();
      };
    }, [editor]);
  }

    const [editor] = useLexicalComposerContext();
    useCustomNodes(editor)

    useEffect(() => {
        const removeTextContentListener = editor.registerTextContentListener(
            (textContent) => {
                console.log('Overall text content changed... making LT request')
                getLanguageToolOutput(textContent).then(x => {
                    if (editor) {
                        editor.update(
                            () => {
                                $getRoot()?.getFirstChild()?.markDirty()
                               
                            }
                        )
                    }
                    
                })
            });
        return () => {
          removeTextContentListener();
        }
        
      }, [editor])

    return null;
  }

Mar 12 '23 03:03 SikandAlex

Only thing I can think of now is to manually set the EditorState to avoid triggering an update listener if that's even possible or otherwise thwart the default dirty marking.

Or possibly I don't understand how to use the registerLexicalTextEntity function.

Mar 12 '23 03:03 SikandAlex

I'm closer to a solution with this. Sorry for all the comments; I'll clean up this thread later. As I work towards getting this done, I'd recommend adding an example of something like this, or of an EditorState that relies on some kind of network request, like the one I'm making to a local Docker container running LanguageTool.

import { useLexicalComposerContext } from '@lexical/react/LexicalComposerContext';
import { useLexicalTextEntity } from '@lexical/react/useLexicalTextEntity';
import { CustomNode, $createCustomNode } from './CustomNode';
import { useEffect, useCallback } from 'react';
import { TextNode } from 'lexical';

import languageToolApiClient from '../../../Managers/LanguageToolApiClient';

const getLanguageToolOutput = (text: string) => {
  return languageToolApiClient.check.checkCreate({
    text: text,
    language: 'en-US',
  });
};

export function FinalPlugin(): JSX.Element | null {
  const [editor] = useLexicalComposerContext();

  useEffect(() => {
    const removeTextContentListener = editor.registerTextContentListener(
      (textContent) => {
        // The latest text content of the editor!
        console.log(textContent);
      }
    );
    return () => {
      // Do not forget to unregister the listener when no longer needed!
      removeTextContentListener();
    };
  }, []);

  useEffect(() => {
    if (!editor.hasNodes([CustomNode])) {
      throw new Error('FinalPlugin: CustomNode not registered on editor');
    }
  }, [editor]);

  const createCustomNode = useCallback((textNode: TextNode): CustomNode => {
    return $createCustomNode('testme', textNode.getTextContent());
  }, []);

  const getMatch = useCallback((text: string) => {
    return {
      end: 1,
      start: 0,
    };
  }, []);

  useLexicalTextEntity<CustomNode>(getMatch, CustomNode, createCustomNode);

  return null;
}

Mar 12 '23 04:03 SikandAlex

I think I have everything I need, but I don't know how to return multiple matches from the getMatch function passed to registerLexicalTextEntity. I will have to look at the code here: https://github.com/facebook/lexical/blob/beb75cfff522ebddf95193b28aac74e23d807c12/packages/lexical-text/src/index.ts#L150

It might be possible to split the text into enough individual TextNodes and then apply the getMatch repeatedly through those 3.

Mar 12 '23 04:03 SikandAlex

https://github.com/facebook/lexical/blob/main/packages/lexical-react/src/LexicalAutoLinkPlugin.ts Looks like this plugin passes multiple matchers.

Mar 13 '23 00:03 SikandAlex

Hi @SikandAlex ! Can you share a minimal reproducible example?

Mar 13 '23 11:03 milaabl

@milaabl I hope this doesn't sound rude but the point of my issue is that I can't create a minimal reproducible example. As I've discussed, my earlier approaches cause the browser to go into an infinite loop (you don't want to try to run this). I really do appreciate the help though.

My approach right now is to modify registerLexicalTextEntity to accept a list of ranges instead of a getMatch function since I already know the ranges into the total text content of the editor.

Unfortunately, I am still figuring out how to stop the infinite loop in the node transform with the necessary pre-conditions. As soon as I have something semi-working, I'll share it.

Mar 14 '23 03:03 SikandAlex

If you want to support inline highlighting of incorrect spellings, possibly look at MarkNodes in the playground and how they are used by the commenting plugin.

Mar 14 '23 14:03 thegreatercurve

@zurfyx built this internally and may be able to add to the discussion here.

Mar 14 '23 14:03 acywatson

I think my confusion came from the fact that Lexical isn't a flat text editor but a hierarchy of nodes, unlike another text editor I used in the past. I'll have to learn more about traversing the hierarchy. I've temporarily switched to TipTap because a user had already written a plugin that I was able to leverage, but I'm interested in returning to Lexical when I have time to migrate over.

Apr 23 '23 18:04 SikandAlex

@zurfyx could you post the example?

Oct 03 '23 14:10 thedjpetersen

also interested in the example that was built internally as we will need something like this soon!

Oct 20 '23 12:10 taismassaro

Could also do with an example of correctly implementing this. I am hoping to use Sapling AI with Lexical which can be used as a drop-in replacement for Grammarly

Feb 08 '24 10:02 robbie-hunt

Bump, would be interested to see example of this!

Feb 12 '24 07:02 matisszemturis

bump

Mar 22 '24 18:03 KalanaPerera

I don't think you will have much luck using node transforms or registerLexicalTextEntity for this: those are synchronous and localized, and what you're doing is neither. Something like registerTextContentListener would be a reasonable approach; the rest of the work is mapping those ranges back into the document tree and then making the appropriate transforms to/from your BadSpellingNode (whether that's an element that wraps text or a text subclass).

A naïve approach would be to do a breadth first search from the root to find the node that maps to a given range (using getTextContentSize probably) then you use that to do your node splitting/wrapping. You'd also need to make sure not to re-wrap nodes that are already marked bad, and unwrap nodes that should no longer be marked bad. It might make sense to first build a whole tree of normal and bad leaf nodes with their associated ranges from the current version of the document, but you will need to iteratively update that as you do your mutations since you will be splitting (marking a new node as bad will result in up to 3 nodes from the original 1) or potentially joining text nodes (removing a bad node could collapse up to 3 nodes into 1) as each range is processed.
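
A minimal sketch of that range-to-node lookup, on a plain-object tree rather than real Lexical nodes: DocNode, textSize and locate are hypothetical names, textSize stands in for getTextContentSize, and the sketch assumes no separator characters between blocks. It walks children in document order and uses subtree sizes to skip whole branches.

```typescript
// Hypothetical stand-ins for a document tree (not Lexical classes).
interface TextLeaf {
  kind: "text";
  text: string;
}
interface ElementBranch {
  kind: "element";
  children: DocNode[];
}
type DocNode = TextLeaf | ElementBranch;

interface Located {
  leaf: TextLeaf;
  offset: number; // offset inside leaf.text
}

// Total number of characters under a node. Recomputed on every call here;
// a real implementation would probably want to cache this.
function textSize(node: DocNode): number {
  if (node.kind === "text") return node.text.length;
  return node.children.reduce((sum, child) => sum + textSize(child), 0);
}

// Finds the leaf containing the character at `index` (0-based, counted
// over the concatenated text of the subtree), or null if out of range.
function locate(node: DocNode, index: number): Located | null {
  if (index < 0 || index >= textSize(node)) return null;
  if (node.kind === "text") return { leaf: node, offset: index };

  let remaining = index;
  for (const child of node.children) {
    const size = textSize(child);
    if (remaining < size) return locate(child, remaining);
    remaining -= size; // skip this whole subtree
  }
  return null;
}
```

For a root holding paragraphs ["Hello ", "wrld"] and ["again"], locate(root, 8) lands in the "wrld" leaf at offset 2. The start and end of an API range would each be located this way before splitting or wrapping.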

Lexical, like HTML, is like a DOM tree and not a flat text document so what you're doing is not really natively supported. Algorithmically, without a separate data structure to cache (and properly invalidate) measurements, working with text ranges is not very efficient for that data model. It makes sense that it would not easily support what you're trying to do in the way you're trying to do it. Updating the size of one node must cascade to every node after it in the document. You can sort of work around this by going backwards (starting by updating the range that comes last in the document, so you don't need updated measurements for nodes that occur later in the doc).
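
The backwards trick can be sketched on a flat list of leaves. This is an illustration under assumptions, not a Lexical implementation: Leaf and markBadBackwards are hypothetical names, ranges are non-overlapping [start, end) offsets, and each range is assumed to fall inside a single leaf. Start offsets are measured once up front; because ranges are processed last to first, splits only disturb leaves at or after the current position, so the measurements for earlier leaves stay valid.

```typescript
// Hypothetical flat model of the document's text leaves.
interface Leaf {
  text: string;
  bad: boolean;
}

function markBadBackwards(leaves: Leaf[], ranges: [number, number][]): Leaf[] {
  const result: Leaf[] = leaves.map((leaf) => ({ ...leaf }));

  // Measure each leaf's start offset once, before any mutation.
  const starts: number[] = [];
  let total = 0;
  for (const leaf of leaves) {
    starts.push(total);
    total += leaf.text.length;
  }

  // Process the range that comes last in the document first.
  const sorted = [...ranges].sort((a, b) => b[0] - a[0]);
  let hi = result.length - 1; // indices above `hi` have shifted; never revisit

  for (const [start, end] of sorted) {
    if (start >= end) continue;
    for (let i = hi; i >= 0; i--) {
      const base = starts[i];
      const leaf = result[i];
      // The leaf may already have been shortened by a later split, so its
      // end is derived from its current text, not from the old measurement.
      if (start < base || end > base + leaf.text.length) continue;

      const parts: Leaf[] = [];
      if (start > base) {
        parts.push({ text: leaf.text.slice(0, start - base), bad: leaf.bad });
      }
      parts.push({ text: leaf.text.slice(start - base, end - base), bad: true });
      if (end < base + leaf.text.length) {
        parts.push({ text: leaf.text.slice(end - base), bad: leaf.bad });
      }
      result.splice(i, 1, ...parts); // 1 leaf becomes up to 3
      hi = i;
      break;
    }
  }
  return result;
}
```

For leaves "I have " and "a speling eror" with ranges [9, 16] and [17, 21], the result is five leaves: "I have ", "a ", "speling" (bad), " ", "eror" (bad). Unwrapping leaves that are no longer bad and merging neighbours is left out of this sketch.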

Mar 26 '24 18:03 etrepum

+1 - @zurfyx could you post the example you mentioned above? This just came up for a customer of ours, they find this a highly important feature. We'd appreciate any help with this.

May 10 '24 10:05 busdav

Also interested in that example. Would be great starting point to integrate tools like Grammarly.

May 10 '24 10:05 jpintoic

@zurfyx Can you please drop in some pointers how to implement this correctly?

Jun 04 '24 14:06 lajoskvcs-at-scale

I also have the same problem. If anyone is interested, I have a partially working example here: https://stackoverflow.com/questions/77791758/lexicaljs-spellchecker

Jun 12 '24 08:06 peerfunk

I noticed this function in the code base $findTextIntersectionFromCharacters which seems to be intended to find a text node + offset from a character position based on the root's textContent.

However, it seems the function was added 3 years ago, and it isn't actually used anywhere. Also, I tested it and it seems to return results that aren't quite right, which might explain why it seems to have been set aside.

I guess textContent's indices aren't a reliable way to find the actual nodes. Probably a lot can go wrong with whitespace, etc.

In any case, getting a selection based on root (or at least paragraph) indices of the textContent (or alternatively by providing a string to find) seems like an important feature to eventually have in Lexical.
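
To illustrate the whitespace concern, here is a small sketch of an index lookup that accounts for separators between paragraphs. It rests on an assumption I have not verified against the current source: that the root's text content joins block-level children with a separator such as "\n\n". The separator is a parameter for that reason, and locateInParagraphs and Hit are hypothetical names.

```typescript
interface Hit {
  paragraph: number; // index of the paragraph
  leaf: number; // index of the text leaf inside that paragraph
  offset: number; // offset inside that leaf's text
}

// Maps an index into the joined text (paragraphs separated by `separator`)
// back to a leaf. Returns null when the index is out of range or points
// into a separator, which belongs to no text leaf.
function locateInParagraphs(
  paragraphs: string[][],
  index: number,
  separator: string = "\n\n",
): Hit | null {
  let base = 0;
  for (let p = 0; p < paragraphs.length; p++) {
    const leaves = paragraphs[p];
    for (let l = 0; l < leaves.length; l++) {
      const size = leaves[l].length;
      if (index >= base && index < base + size) {
        return { paragraph: p, leaf: l, offset: index - base };
      }
      base += size;
    }
    // Separator characters count toward the flat index but not toward any leaf.
    if (p < paragraphs.length - 1) base += separator.length;
  }
  return null;
}
```

With paragraphs ["Helo ", "world"] and ["Secnd line"], flat index 12 maps to the first character of the second paragraph, while indices 10 and 11 fall inside the separator and map to no node. A lookup that ignores the separator would be off by two characters per preceding paragraph.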

Sep 19 '24 16:09 PEsteves8
