python-docx-template
Differences in editors (Libre and Microsoft)
Describe your problem
Hello! I generate a new docx from a template using JSON data. I need to use HTML in template variables, so I created a subdoc. But the resulting file looks different in different editors (LibreOffice and Microsoft Word): some text is missing. If you open the file in Microsoft Word you can see the text, but in LibreOffice it is lost.
When I convert the document to PDF (soffice CLI), the text is missing as well. 17-3.docx
How can I keep the text imported from HTML?
More details about your problem
code:
```python
#!/usr/bin/python3
import os
import sys
import time
import docxtpl
import requests
from docx import Document
from htmldocx import HtmlToDocx

base_path = '/home/support/domains/'
document_id = sys.argv[1]    # 9-2022
hostname = sys.argv[2]
template_file = sys.argv[3]  # /sites/base/files.docs.docx

doc = docxtpl.DocxTemplate(template_file)
document_url = HTTP_URL
resp = requests.get(url=document_url)
data = resp.json()

if data:
    # Parse the body (it may contain HTML) and save it to a temporary docx
    desc_document = Document()
    new_parser = HtmlToDocx()
    new_parser.table_style = 'TableGrid'
    new_parser.paragraph_style = 'Body Text'
    new_parser.add_html_to_document(data['document']['body'], desc_document)
    desc_result_path = "/home/support/domains/tmp/part_" + document_id + ".docx"
    desc_document.save(desc_result_path)
    sub_doc = doc.new_subdoc(desc_result_path)
    context = {
        'document_date': time.strftime('%d.%m.%Y', time.gmtime(int(data['document']['created']))),
        'document_text': sub_doc,
        'document_creator': data['creator'],
        'creator_sign': data['creator_sign'],
        'document_sender': data['creator'],
        'document_sender_role': '',
        'document_creator_role': data['document_creator_role'],
    }
    # If there are signatures, the last signer becomes the sender
    if 'signs_list' in data:
        context['document_sender'] = data['signs_list'][-1]['user_full_name']
        context['document_signs'] = data['signs_list']
    doc.render(context)
    # Make sure the output directory exists before saving
    preview_dir = base_path + hostname + "/preview/" + hostname
    os.makedirs(preview_dir, exist_ok=True)
    doc.save(preview_dir + "/" + document_id + ".docx")
```
I do not understand what you want to do. Please provide a much simpler example that reproduces the problem.
The file produced by the code opens differently in different editors. You can check this by opening the attached file in Microsoft Word and in LibreOffice: you will see that data has disappeared in LibreOffice.
The 'document_text' variable should contain HTML like "<p>text</p><h1>fooo</h1>"
Simple example
Json = {'document_text':'<p>text</p><h1>fooo</h1>', 'document_title':'Hello world'}
Execute the code.
Resulting document:
Viewed in Microsoft Word:
Hello World text fooo
Viewed in LibreOffice:
Hello World
File example: 17-3.docx
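Besides comparing the two editors by eye, one quick structural check is to see whether every XML part inside the generated docx still parses. The sketch below uses only the Python standard library; the function name `malformed_xml_parts` and the in-memory archive are made up for illustration, and a well-formed part can still be invalid WordprocessingML, so an empty result does not prove the file is fine.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def malformed_xml_parts(docx_file):
    """Return the names of .xml/.rels parts in a docx that fail to parse.

    docx_file may be a path or a file-like object (a docx is a zip archive).
    """
    bad = []
    with zipfile.ZipFile(docx_file) as archive:
        for name in archive.namelist():
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                ET.fromstring(archive.read(name))
            except ET.ParseError:
                bad.append(name)
    return bad


# Illustration only: a fake archive with one good part and one broken part.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<doc><p>ok</p></doc>")
    archive.writestr("word/header1.xml", "<hdr><p>text<br></p></hdr>")
buffer.seek(0)

result = malformed_xml_parts(buffer)
print(result)
```

For a real file, pass its path instead of the buffer, for example the docx written by `doc.save(...)`.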
To test, I removed the HTML conversion part of the code and left only the subdoc. So the problem is in this part:
sub_doc = doc.new_subdoc("/path/to/document.docx")
context = {
    'document_text': sub_doc,
}
doc.render(context)
doc.save("/path/tonewfile/document.docx")
HTML rendering is not supported by docxtpl. You can only display HTML as unrendered text, by passing autoescape=True to render(). If you do not escape the HTML, it will corrupt document.xml inside the generated docx. The corrupted docx may then be interpreted differently by different editors.
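To illustrate why escaping matters, here is a minimal sketch using only the standard library. It does not call docxtpl; it just imitates substituting a value into a `w:t` text element by plain string formatting, which is an assumption about what happens conceptually, not a description of docxtpl internals. The helper `make_run` is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
html = "<p>text</p><h1>fooo</h1>"


def make_run(value):
    """Build a minimal w:r/w:t fragment by plain string substitution."""
    xml = f'<w:r xmlns:w="{W_NS}"><w:t>{value}</w:t></w:r>'
    return ET.fromstring(xml)


# Unescaped: the HTML tags become foreign child elements of w:t,
# so the text element itself no longer holds any text.
raw_t = make_run(html).find(f"{{{W_NS}}}t")
raw_children = [child.tag for child in raw_t]

# Escaped: the markup is stored as literal text inside w:t.
escaped_t = make_run(escape(html)).find(f"{{{W_NS}}}t")
escaped_text = escaped_t.text

# HTML that is not well-formed XML (an unclosed <br>) breaks parsing outright.
try:
    make_run("line one<br>line two")
    unclosed_ok = True
except ET.ParseError:
    unclosed_ok = False

print(raw_children, repr(escaped_text), unclosed_ok)
```

With docxtpl itself, the equivalent of the escaped case is `doc.render(context, autoescape=True)`, which shows the tags as plain text rather than formatting.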
Can this be supported, or is there a workaround? After a whole afternoon I couldn't find a way to convert HTML to rich text; it seems to be a tricky thing.
Sorry, but there is no obvious way to render HTML via docxtpl.
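For very simple inline markup, a partial workaround might be to split the HTML into styled text segments yourself and feed them to docxtpl's `RichText`. The sketch below is not general HTML rendering: it only tracks bold and italic tags, ignores block structure such as `<p>` or `<h1>`, and the class and function names (`InlineSegments`, `html_to_segments`) are made up for illustration.

```python
from html.parser import HTMLParser


class InlineSegments(HTMLParser):
    """Collect (text, bold, italic) tuples from simple inline HTML."""

    BOLD = {"b", "strong"}
    ITALIC = {"i", "em"}

    def __init__(self):
        super().__init__()
        self.bold = 0      # nesting depth of bold tags
        self.italic = 0    # nesting depth of italic tags
        self.segments = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD:
            self.bold += 1
        elif tag in self.ITALIC:
            self.italic += 1

    def handle_endtag(self, tag):
        if tag in self.BOLD:
            self.bold = max(0, self.bold - 1)
        elif tag in self.ITALIC:
            self.italic = max(0, self.italic - 1)

    def handle_data(self, data):
        if data:
            self.segments.append((data, self.bold > 0, self.italic > 0))


def html_to_segments(html):
    parser = InlineSegments()
    parser.feed(html)
    parser.close()  # flush any trailing text
    return parser.segments


segments = html_to_segments("plain <b>bold</b> and <em>italic</em>")
print(segments)

# Possible follow-up with docxtpl (not executed here):
#   from docxtpl import RichText
#   rt = RichText()
#   for text, bold, italic in segments:
#       rt.add(text, bold=bold, italic=italic)
#   doc.render({'document_text': rt})   # template tag: {{r document_text }}
```

Anything beyond this (headings, lists, tables, links) would need its own mapping, which is why it stays a sketch and not a replacement for real HTML support.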