
Bold Texts

Open spekulatius opened this issue 3 years ago • 11 comments

Hey @euangoddard

Super long shot here. I know this might never happen, but I'll give it a try anyway. I noticed that all bold text is dropped when I use https://euangoddard.github.io/clipboard2markdown/. Any chance support for this could be added?

Cheers, Peter

spekulatius avatar May 16 '21 19:05 spekulatius

Thanks for reporting this. It would be good to have a reproduction. If I had to guess, I would suspect that the bold text in the source only appears bold and isn't actually marked as bold in a way the processor understands, i.e. it has some style applied to make it look bold. Whenever I have tested this with formatted text, it has always translated correctly. If you can attach an example source document, I can have a look at what's going on.

euangoddard avatar May 17 '21 07:05 euangoddard

Hello @euangoddard,

Yeah, I understand it might look bold without actually being marked as bold; it's just displayed that way.

It's happening when I use Google Docs. I've created an example here: https://docs.google.com/document/d/17tpX1h53phZ-64uP4flTiOJIMQrpvEuCcdf49hdvV2A/edit?usp=sharing

Cheers, Peter

spekulatius avatar May 17 '21 08:05 spekulatius

I suspect it is a Google Docs issue. When you export a Google Doc as HTML, the bold sections aren't represented with <strong> or <b> tags but are simply styled to look bold, so I suspect the same root cause is at work here. It would be pretty hard to solve: you'd have to detect styles that look bold and then convert them, on the fly, to semantically bold tags prior to processing. Happy to consider any PR that addresses this.

euangoddard avatar May 17 '21 08:05 euangoddard

Hmmm, okay. I guessed something along these lines when opening the issue. I've seen the bold part working with other sources. So I guessed there is something non-standard going on.

I've got limited time at the moment, and for the next few weeks I certainly won't have time for any side projects. Maybe after that.

spekulatius avatar May 17 '21 10:05 spekulatius

Fantastic that you're at least keen to try! I think it could be really quite a tricky problem to solve as you'll need to traverse the entire node tree and replace styles that look bold with semantic elements. Good luck!
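
The traversal described above could be sketched roughly as follows. This is only an illustration, not code from clipboard2markdown: `isVisuallyBold`, `findBoldLooking` and `wrapInStrong` are hypothetical names, the `> 500` threshold is an assumption, and the sketch only inspects inline styles (it assumes the pasted HTML carries its formatting inline rather than in a stylesheet).

```javascript
// Sketch only: these helpers are hypothetical, not part of clipboard2markdown.

// An inline font-weight is a string: a keyword ('bold') or a number ('700').
function isVisuallyBold(fontWeight) {
  if (fontWeight === 'bold' || fontWeight === 'bolder') return true;
  // parseInt yields NaN for '', 'normal' or undefined, and NaN > 500 is false.
  return parseInt(fontWeight, 10) > 500;
}

// Depth-first walk collecting elements whose inline style looks bold but
// which are not already <b> or <strong>.
function findBoldLooking(root, found = []) {
  const tag = (root.tagName || '').toUpperCase();
  const weight = root.style ? root.style.fontWeight : '';
  if (tag !== 'B' && tag !== 'STRONG' && isVisuallyBold(weight)) {
    found.push(root);
  }
  for (const child of Array.from(root.children || [])) {
    findBoldLooking(child, found);
  }
  return found;
}

// Browser-only step: move the element's contents into a semantic <strong>.
function wrapInStrong(el) {
  const strong = el.ownerDocument.createElement('strong');
  while (el.firstChild) strong.appendChild(el.firstChild);
  el.appendChild(strong);
}
```

In a browser this might be used as `findBoldLooking(container).forEach(wrapInStrong);` before handing the HTML to the converter, where `container` is whatever element holds the pasted content. Nested bold-looking elements would be wrapped twice, so a real implementation would probably need to deduplicate.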

euangoddard avatar May 17 '21 10:05 euangoddard

That sounds like an interesting challenge. Not sure I'll manage, though.

spekulatius avatar May 17 '21 10:05 spekulatius

Hmm, I might not make a full PR for this, but adding this to clipboard2markdown.js has worked for converting Reddit's bold text properly for me:



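    {
      filter: function (node) {
        // fontWeight is a string: a keyword ('bold', 'bolder') or a number ('700'),
        // so check the keywords explicitly and parse the numeric form
        var weight = node.style && node.style.fontWeight;
        return weight === 'bold' || weight === 'bolder' || parseInt(weight, 10) > 500;
      },
      replacement: function (content) {
        return '**' + content + '**';
      }
    },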

DanielOaks avatar Dec 18 '21 07:12 DanielOaks

That's interesting. I am currently rewriting the project from scratch and can definitely include this patch in there. It makes sense to me to support these visually styled bold elements.

euangoddard avatar Dec 19 '21 09:12 euangoddard

Supporting italics would be nice as well! My use case is also Google Docs -> Markdown.

colinbrislawn avatar Sep 27 '22 19:09 colinbrislawn

@colinbrislawn it's interesting that pasting from Google Docs doesn't support italics. When I looked into this previously, it seemed to work. I know that some rich text editors don't implement rich text as semantic elements, which makes it quite difficult to recover the original author's formatting intent. The approach mentioned by @DanielOaks could certainly be used for italics as well.
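
A minimal sketch of what the italic variant might look like, following the same `filter`/`replacement` shape as the bold snippet earlier in the thread. The name `italicRule` is hypothetical, and the underscore delimiter is an assumption (asterisks would work equally well in Markdown).

```javascript
// Hypothetical italic counterpart to the bold rule. Assumes `node` exposes
// an inline `style` object, as DOM elements do.
const italicRule = {
  filter: function (node) {
    // font-style is a keyword string; '' or 'normal' means not italic
    const fontStyle = node.style && node.style.fontStyle;
    return fontStyle === 'italic' || fontStyle === 'oblique';
  },
  replacement: function (content) {
    return '_' + content + '_';
  }
};
```

Like the bold rule, this only catches inline styles; italics applied through a stylesheet class would not be detected.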

euangoddard avatar Sep 28 '22 07:09 euangoddard

Just experienced the same with text copied from SharePoint/Word using macOS Safari. Headers would be nice too; they come across via the "data-ccp-parastyle" attribute, according to https://dynalist.io/clipboard

mackaaij avatar Dec 23 '22 13:12 mackaaij