python-docx-template

Differences in editors (Libre and Microsoft)

Open: dmitryskachkov opened this issue 2 years ago • 4 comments

Describe your problem

Hello! I created a new docx template from JSON data. I need to use HTML in template variables, so I created a subdoc. But the resulting file renders differently in different editors (LibreOffice and Microsoft Word): some text is missing. If you open the file in Microsoft Word you can see the text, but in LibreOffice it is lost.

When I try to convert the document to PDF (soffice CLI), the text is missing as well. Attached file: 17-3.docx

How can I preserve the text imported from HTML?

More details about your problem

code:

`#!/usr/bin/python3

import os
import sys
import time

import docxtpl
import requests
from docx import Document
from htmldocx import HtmlToDocx

base_path = '/home/support/domains/'
document_id = sys.argv[1]    # e.g. 9-2022
hostname = sys.argv[2]
template_file = sys.argv[3]  # /sites/base/files.docs.docx

doc = docxtpl.DocxTemplate(template_file)
document_url = HTTP_URL      # URL of the JSON endpoint (not given in the post)

resp = requests.get(url=document_url)
data = resp.json()

if data:
    # Parse the body (it may contain HTML) and pre-save it to a temporary docx
    desc_document = Document()
    new_parser = HtmlToDocx()
    new_parser.table_style = 'TableGrid'
    new_parser.paragraph_style = 'Body Text'
    new_parser.add_html_to_document(data['document']['body'], desc_document)
    desc_result_path = "/home/support/domains/tmp/part_" + document_id + ".docx"
    desc_document.save(desc_result_path)
    sub_doc = doc.new_subdoc(desc_result_path)

    context = {
        'document_date': time.strftime('%d.%m.%Y', time.gmtime(int(data['document']['created']))),
        'document_text': sub_doc,
        'document_creator': data['creator'],
        'creator_sign': data['creator_sign'],
        'document_sender': data['creator'],
        'document_sender_role': '',
        'document_creator_role': data['document_creator_role'],
    }
    # Use the same key for the check and for the lookups
    if 'signs_list' in data:
        context['document_sender'] = data['signs_list'][-1]['user_full_name']
        context['document_signs'] = data['signs_list']

    doc.render(context)

    # Create the directory that the file is actually saved into
    preview_dir = base_path + hostname + "/preview/" + hostname
    os.makedirs(preview_dir, exist_ok=True)
    doc.save(preview_dir + "/" + document_id + ".docx")

`

dmitryskachkov • Oct 05 '22 21:10

I do not understand what you want to do. Please provide a much simpler example that reproduces the problem.

elapouya • Oct 05 '22 21:10

The file produced by the code renders differently in different editors. You can check this by opening the attached file in Microsoft Word and in LibreOffice: in LibreOffice the data has disappeared.

The 'document_text' variable should contain HTML such as '<p>text</p><h1>fooo</h1>'.

Simple example

Json = {'document_text':'<p>text</p><h1>fooo</h1>', 'document_title':'Hello world'}

Execute the code.

Resulting document:

Viewed in Microsoft Word:

Hello World text fooo

Viewed in LibreOffice:

Hello World

Example file: 17-3.docx

dmitryskachkov • Oct 06 '22 06:10

As a test, I removed the HTML-to-docx part of the code and left only the subdoc, so the problem is in this part:

    sub_doc = doc.new_subdoc("/path/to/document.docx")
    context = {'document_text': sub_doc}
    doc.render(context)
    doc.save("/path/tonewfile/document.docx")

dmitryskachkov • Oct 06 '22 09:10
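To narrow down a repro like this, it can help to check whether the missing text is actually present in the generated file. A .docx is a ZIP archive whose main part is word/document.xml, so the standard library is enough for a rough check. The sketch below is a diagnostic under stated assumptions, not part of docxtpl: `extract_body_text` is a hypothetical helper, and the in-memory archive only stands in for a real generated file such as the attached 17-3.docx.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:t>
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_body_text(docx_file):
    """Return the text of every <w:t> element in word/document.xml.

    `docx_file` may be a path or a file-like object.
    Raises xml.etree.ElementTree.ParseError if the part is not well-formed XML.
    """
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    return [node.text or "" for node in root.iter(f"{{{W_NS}}}t")]


# Stand-in for a generated file: a minimal archive holding only document.xml.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr(
        "word/document.xml",
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        "<w:t>Hello World</w:t></w:r></w:p></w:body></w:document>",
    )
sample.seek(0)

texts = extract_body_text(sample)
print(texts)
```

If the expected text shows up in this list but not in LibreOffice, the content is in the file and the difference lies in how each editor interprets the surrounding XML. A `ParseError` would instead point to a broken document.xml.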

HTML rendering is not supported by docxtpl. You can only display HTML as unrendered text, by passing autoescape=True to render(). If you do not escape the HTML, it will break document.xml inside the generated docx. The corrupted docx may then be interpreted differently by different editors.

elapouya • Oct 06 '22 09:10
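The effect of skipping the escaping step can be illustrated without docxtpl. The sketch below is a simplified stand-in, assuming the value is substituted verbatim into the XML: it pastes a string into a single WordprocessingML run and parses the result. `run_xml` is a hypothetical helper, not a docxtpl function.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def run_xml(value):
    """Build a minimal run with `value` pasted into <w:t> verbatim."""
    return f'<w:r xmlns:w="{W_NS}"><w:t>{value}</w:t></w:r>'


html = "<p>text</p><h1>fooo</h1>"

# Unescaped: the HTML tags become child elements of <w:t>, which the
# format does not define, and <w:t> itself no longer holds any text.
raw_t = ET.fromstring(run_xml(html)).find(f"{{{W_NS}}}t")
raw_children = [child.tag for child in raw_t]

# Escaped: the markup is stored as literal text inside <w:t>.
safe_t = ET.fromstring(run_xml(escape(html))).find(f"{{{W_NS}}}t")

# Unescaped HTML with an unclosed tag is not even well-formed XML.
try:
    ET.fromstring(run_xml("<p>one<br>two</p>"))
    unclosed_parses = True
except ET.ParseError:
    unclosed_parses = False

print(raw_children, raw_t.text, safe_t.text, unclosed_parses)
```

In the unescaped case the text ends up inside elements that a word processor has no defined way to handle, which is consistent with one editor showing it and another dropping it.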

> HTML rendering is not supported by docxtpl. You can only display HTML as unrendered text, by passing autoescape=True to render(). If you do not escape the HTML, it will break document.xml inside the generated docx. The corrupted docx may then be interpreted differently by different editors.

Can this be supported, or is there a workaround? After a whole afternoon I still can't find a way to convert HTML to rich text; it seems to be a tricky thing.

ohplz • Nov 14 '22 09:11

Sorry, but there is no obvious way to render HTML via docxtpl.

elapouya • Nov 15 '22 09:11
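If losing the formatting is acceptable, one possible fallback is to reduce the HTML to plain-text paragraphs before rendering, so that no markup reaches document.xml. This is only a sketch using the standard library's `html.parser`; `BlockTextExtractor` and `html_to_blocks` are hypothetical names, the set of block-level tags is an assumption, and none of this is a docxtpl feature.

```python
from html.parser import HTMLParser


class BlockTextExtractor(HTMLParser):
    """Collect the text of an HTML fragment, split at block-level tags."""

    # Assumed set of tags that start or end a paragraph-like block
    BLOCK_TAGS = {"p", "div", "br", "li", "tr",
                  "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._current = []

    def end_block(self):
        """Store the text gathered so far as one block, if it is not empty."""
        text = "".join(self._current).strip()
        if text:
            self.blocks.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.end_block()

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.end_block()

    def handle_data(self, data):
        self._current.append(data)


def html_to_blocks(html):
    """Return the plain text of `html` as a list of paragraph strings."""
    parser = BlockTextExtractor()
    parser.feed(html)
    parser.close()
    parser.end_block()  # keep any trailing text outside a block tag
    return parser.blocks


blocks = html_to_blocks("<p>text</p><h1>fooo</h1>")
print(blocks)
```

The resulting list could then be passed in the context and rendered with autoescape=True, for example by looping over it in the template. Headings, bold text, tables and any other formatting are lost with this approach.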