
Feature: Copy Paste Bullet/Number list from MS Word Document

Open muleyprasanna opened this issue 2 years ago • 4 comments

Lexical version: 0.11.1

Steps To Reproduce

  1. Copy a bullet/numbered list from a Word document
  2. Paste it into the Lexical Rich Text Editor
  3. The list is not rendered in Lexical's bullet/numbered list format

Link to code example:

The current behavior

Copy-pasted bullet/numbered lists aren't displayed correctly

The expected behavior

Copy-pasted bullet/numbered lists should be displayed correctly

MS Word List Screenshot BulletList_Bug1

Rich Text Editor Screenshot (After bullet list paste from word document) BulletList_Bug2

Rich Text Editor Screenshot (After numbered list paste from word document) BulletList_Bug3

Rich Text Editor Screenshot (Expected bullet list behaviour) BulletList_Expected1

Rich Text Editor Screenshot (Expected numbered list behaviour) BulletList_Expected2 (url)

muleyprasanna avatar Jul 10 '23 12:07 muleyprasanna

Can you help me out by pasting in here the HTML that MS Word puts on the clipboard? The issue is going to be that they're not using semantic HTML.

acywatson avatar Jul 26 '23 21:07 acywatson

BulletList.txt NumberedList.txt

We got the HTML from the clipboard; please find it attached. Thank you very much for your response.

muleyprasanna avatar Jul 27 '23 06:07 muleyprasanna

I'm keeping an eye on the progress of this one. I was testing this morning and I'm getting the same issue.

MauricioGrM avatar Feb 18 '25 16:02 MauricioGrM

Lexical's behavior is correct: Word isn't putting semantic HTML on the clipboard, and Lexical is just importing what it sees. You'll get the same behavior if you paste this HTML into Google Docs, for example.

I'm sure you could implement a workaround with importDOM and/or the editor's html.import configuration, but specific support for this non-semantic MS Word HTML dialect is not likely to be included in the core by default.

etrepum avatar Feb 18 '25 17:02 etrepum
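
Whichever hook such a workaround uses, it first has to recognise a Word list paragraph and work out its nesting depth. The following is a minimal sketch of that step only, assuming Word marks list paragraphs with an inline style like `mso-list:l0 level1 lfo1` (the attached BulletList.txt and NumberedList.txt are not reproduced here, so the exact markup is an assumption). `WordListInfo` and `parseMsoList` are hypothetical names, not part of Lexical.

```typescript
// Hypothetical shape for the information recovered from a Word paragraph.
interface WordListInfo {
  listId: string; // Word's internal list identifier, e.g. "l0"
  level: number; // 1-based nesting depth
}

// Hypothetical helper: inspects a paragraph's style attribute and returns
// list info, or null when the paragraph does not look like a Word list item.
// Assumes the "mso-list:<id> level<n> ..." convention described above.
function parseMsoList(style: string | null): WordListInfo | null {
  if (!style) {
    return null;
  }
  const match = /mso-list:\s*(l\d+)\s+level(\d+)/i.exec(style);
  if (!match) {
    // Also covers values without a level, such as "mso-list:Ignore".
    return null;
  }
  const [, listId = "", levelText = "1"] = match;
  return { listId, level: parseInt(levelText, 10) };
}
```

A custom import conversion could call a helper like this on each pasted paragraph and fall back to the default handling whenever it returns null.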

Just blogged about this explaining a workaround for several different frameworks including lexical, hoping it's helpful 🙏

juliankrispel avatar Jul 14 '25 06:07 juliankrispel

@juliankrispel did you test this code with Lexical? I tried it locally and it's not working as expected. Also, the markup exported from Word appears to be completely different from what is implemented in the blog post. Any idea if I am doing something wrong?

Image

harshmetkel24 avatar Jul 15 '25 08:07 harshmetkel24

Hi @etrepum, I’ve tried an approach that addresses most cases. I understand that this type of change may not be suitable for inclusion in the Lexical core, but I’ve opened a PR in case others facing the same issue might find it helpful or inspirational. I would also appreciate it if you could take a look and let me know if anything is incorrect or could be improved. Additionally, I’ve noticed that content with applied font-size, when copied from sources like Google Docs or MS Office, is not working as expected. Do we have an existing open issue for this, or any quick tips to address it? Thank you!

harshmetkel24 avatar Jul 21 '25 10:07 harshmetkel24

The solution to any HTML import issue is covered in my previous comment: you need to override importDOM or the html config accordingly.

etrepum avatar Jul 21 '25 14:07 etrepum

@juliankrispel did you test this code with Lexical? I tried it locally and it's not working as expected. Also, the markup exported from Word appears to be completely different from what is implemented in the blog post. Any idea if I am doing something wrong?

I certainly did - using something similar for a client. I can't tell you what you did wrong because I can't see your code :)

juliankrispel avatar Jul 31 '25 04:07 juliankrispel

The solution to any HTML import issue is covered in my previous comment: you need to override importDOM or the html config accordingly.

I don't think that's quite the right approach tbh, since Word doesn't output semantic HTML.

juliankrispel avatar Jul 31 '25 04:07 juliankrispel

That's exactly why you have to override the default implementation, the one that's designed for semantic markup.

etrepum avatar Jul 31 '25 04:07 etrepum

That's exactly why you have to override the default implementation, the one that's designed for semantic markup.

I'm not sure I follow tbh. I'll try and explain once more. Here's my take: the import configuration is expecting semantic HTML. It therefore makes more sense to first make the HTML semantic; otherwise you're just polluting the import configuration with Word-specific and then Google-specific and then whatever other crappy content you need to parse. This is why I and the clients I have implemented this for, with Lexical and other frameworks, are generally happier with separating these concerns.

juliankrispel avatar Aug 01 '25 11:08 juliankrispel
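
The "make the HTML semantic first" approach described above can be sketched as a separate preprocessing step. This is a rough illustration, not the code from the blog post: it assumes the pasted paragraphs have already been reduced to a flat array of items with a nesting level and a bullet/numbered flag. `FlatItem`, `escapeHtml` and `toSemanticList` are hypothetical names.

```typescript
// Hypothetical flat representation of one pasted list paragraph.
interface FlatItem {
  level: number; // 1-based nesting depth
  ordered: boolean; // true for numbered lists, false for bullets
  text: string; // plain text of the item
}

// Escapes the characters that would otherwise be parsed as markup.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Rebuilds nested <ul>/<ol> markup from the flat items.
function toSemanticList(items: FlatItem[]): string {
  let html = "";
  const open: string[] = []; // stack of currently open list tags
  for (const item of items) {
    const tag = item.ordered ? "ol" : "ul";
    const depth = Math.max(1, item.level);
    // Moving back up: close the deeper lists and their last items.
    while (open.length > depth) {
      html += `</li></${open.pop()}>`;
    }
    if (open.length === depth) {
      html += "</li>"; // close the previous sibling
      if (open[open.length - 1] !== tag) {
        html += `</${open.pop()}>`; // list type changed at this depth
      }
    }
    // Moving down: open lists until the target depth is reached.
    while (open.length < depth) {
      html += `<${tag}>`;
      open.push(tag);
      if (open.length < depth) {
        html += "<li>"; // wrapper item for a skipped level
      }
    }
    html += `<li>${escapeHtml(item.text)}`;
  }
  while (open.length > 0) {
    html += `</li></${open.pop()}>`;
  }
  return html;
}
```

For three bullet items A (level 1), B (level 2) and C (level 1), this produces `<ul><li>A<ul><li>B</li></ul></li><li>C</li></ul>`, which a semantic HTML importer can then handle without any Word-specific rules.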