pydocx Double quote renders improperly

Double quote renders improperly | HTML

Open bkamapantula opened this issue 10 years ago • 3 comments

Here's the text used in the .docx file.

Another work that discusses traffic localization is “A New Approach for Achieving Traffic-Exchange Localization in P2P-based Content Distribution” [2009].

Attached are the screenshots for the text rendered in Google Chrome (v 34.0.1847.132) and Firefox (v 28.0) respectively:

[Screenshots: pydocx-quote-issue-chrome, pydocx-quote-issue-ff]

May 04 '14 15:05 bkamapantula

Thanks for reporting this issue. We're actually (silly us) handling this case outside of pydocx. Here's a workaround you can use until this gets fixed:

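    # Map typographic single quotes (entities and code points) to a plain apostrophe.
    for single_quote in [
            u'&#145;', u'&#146;', u'\u2018', u'\u2019', u'\x91', u'\x92']:
        html = html.replace(single_quote, u"'")
    # Map typographic double quotes to a plain double quote.
    for double_quote in [
            u'&ldquo;', u'&rdquo;', u'\u201c', u'\u201d', u'\x93', u'\x94']:
        html = html.replace(double_quote, u'"')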
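For anyone who wants to try the workaround in isolation, here is a minimal, self-contained sketch under Python 3. The function name `normalize_quotes` and the constant names are hypothetical and not part of pydocx. It also assumes the single-quote variants are meant to become a plain apostrophe rather than be dropped.

```python
# Hypothetical helper, not part of pydocx: applies the quote workaround
# to an HTML string that has already been generated.

# Typographic single-quote variants (HTML entities, Unicode, Windows-1252 bytes).
SINGLE_QUOTE_VARIANTS = ('&#145;', '&#146;', '\u2018', '\u2019', '\x91', '\x92')
# Typographic double-quote variants.
DOUBLE_QUOTE_VARIANTS = ('&ldquo;', '&rdquo;', '\u201c', '\u201d', '\x93', '\x94')


def normalize_quotes(html):
    """Return html with typographic quotes replaced by plain ASCII quotes."""
    for variant in SINGLE_QUOTE_VARIANTS:
        # Assumption: single quotes map to an apostrophe, not an empty string.
        html = html.replace(variant, "'")
    for variant in DOUBLE_QUOTE_VARIANTS:
        html = html.replace(variant, '"')
    return html


if __name__ == '__main__':
    sample = 'is \u201cA New Approach for Achieving Traffic-Exchange Localization\u201d [2009].'
    print(normalize_quotes(sample))
```

Note that this discards the curly quotes entirely, so it is a stopgap rather than a fix for the rendering itself.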

May 04 '14 15:05 kylegibson

Thanks for this solution. I was testing to report issues.

May 04 '14 16:05 bkamapantula

Do you recall if this output was produced by running the pydocx command? For example: pydocx --html input.docx output.html

Thanks, -Kyle

Mar 18 '15 18:03 kylegibson
