Could HTML style "font-style" be analysed in order to conserve bold/italic text, please?

Open 101101100 opened this issue 2 years ago • 0 comments

I have a large document converted from PDF to HTML that contains a lot of bold and/or italic emphasis. My goal is to be able to copy the text with the emphasis preserved.

The original HTML does not use ... tags, but spans with inline style definitions, as in this example: erschienenen <span class="font5" style="font-style:italic;">Handbuches</span>

The current MarkDownload plugin copies this to the clipboard in the formats CF_TEXT (ID 1) and CF_UNICODETEXT (ID 13), and does not preserve any emphasis. Both entries contain just the plain text: erschienenen Handbuches

I would like to get erschienenen _Handbuches_ instead.
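
To illustrate the mapping I have in mind, here is a minimal sketch in Python using only the standard library. It is not MarkDownload code and says nothing about how the plugin works internally. The names `emphasis_marker`, `StyleEmphasisParser` and `html_to_markdown` are made up for this example, and treating a numeric `font-weight` of 600 or more as bold is my own assumption.

```python
from html.parser import HTMLParser


def emphasis_marker(style):
    """Return the Markdown opening delimiter implied by an inline style string."""
    # Split "prop: value; prop: value" into a dict, ignoring case and spaces.
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    italic = props.get("font-style") in ("italic", "oblique")
    weight = props.get("font-weight", "")
    # Assumption: numeric weights of 600 and above count as bold.
    bold = weight in ("bold", "bolder") or (weight.isdigit() and int(weight) >= 600)
    return ("**" if bold else "") + ("_" if italic else "")


class StyleEmphasisParser(HTMLParser):
    """Copy text through and wrap styled <span> content in Markdown emphasis."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.closers = []  # one closing delimiter per open <span>

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return  # other tags are dropped, only their text is kept
        style = dict(attrs).get("style") or ""
        marker = emphasis_marker(style)
        self.out.append(marker)
        self.closers.append(marker[::-1])  # "**_" closes as "_**"

    def handle_endtag(self, tag):
        if tag == "span" and self.closers:
            self.out.append(self.closers.pop())

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html):
    parser = StyleEmphasisParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


example = 'erschienenen <span class="font5" style="font-style:italic;">Handbuches</span>'
print(html_to_markdown(example))  # erschienenen _Handbuches_
```

The sketch only looks at the `style` attribute of `span` elements and does not handle whitespace at the edges of a span or styles that come from a stylesheet class such as `font5`.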

I did not find anything I can do myself to add this styling variant to the variants that are detected; I would be glad to learn of a way.

Dec 25 '23 17:12 101101100