'HtmlToDocx' object has no attribute 'run'

Open Amritpal2001 opened this issue 3 years ago • 8 comments

Hey, I sometimes get this error while converting from HTML to docx. [Screenshot 2022-03-30 at 10 30 55 PM]

Thanks!

Amritpal2001 • Apr 01 '22 07:04

Same here

maxamly • May 29 '22 21:05

same here, any solution?

sanaullahaq • Dec 07 '22 05:12

I could be wrong, but from what I have found, the error occurs when I try to convert HTML containing a table with an empty/blank cell to Docx.

sanaullahaq • Dec 15 '22 11:12

I can confirm this error occurs if you put a <br> tag right at the start of a <td>. If you put anything before the <br> then it seems to work fine. For example:

<td><br>Hello world</td> throws an error
<td>Hello world<br></td> does not

I would guess that the run needs to be initialised somewhere. If some content precedes the <br> then the run has already been created by the time the <br> is parsed, but when the <br> is the first child of the <td> then the run attribute is missing which causes the error.
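
To illustrate that guess, here is a minimal toy model of the suspected failure mode. It is a sketch built on the standard library's `html.parser`, not htmldocx's actual code: the `ToyParser` class, its `output` list, and the `parse` helper are all hypothetical names made up for this example. The only thing it shares with the real parser is the assumption that `run` is created lazily when text is seen.

```python
from html.parser import HTMLParser


class ToyParser(HTMLParser):
    """Hypothetical stand-in for the real parser; NOT htmldocx's implementation."""

    def __init__(self):
        super().__init__()
        self.output = []
        # Assumption under test: self.run is deliberately NOT initialised here.

    def handle_data(self, data):
        # The "run" only comes into existence once some text has been seen.
        self.run = [data]
        self.output.append(self.run)

    def handle_starttag(self, tag, attrs):
        if tag == 'br':
            # Raises AttributeError if no text preceded the <br>.
            self.run.append('\n')


def parse(html):
    """Feed html through a fresh ToyParser and return the collected runs."""
    parser = ToyParser()
    parser.feed(html)
    parser.close()
    return parser.output
```

With this toy, `parse('<td>Hello world<br></td>')` works because the text creates `run` before the `<br>` is handled, while `parse('<td><br>Hello world</td>')` fails with an AttributeError on `run`, which matches the behaviour described above. If the real cause is the same, initialising the run before the first tag is handled would be the natural fix.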

dashingdove • Feb 06 '23 08:02

The error also occurs when adding a <br> to the start of a document.

import docx
import htmldocx

document = docx.Document()
html_parser = htmldocx.HtmlToDocx()
html_parser.add_html_to_document('<br>', document)  # raises AttributeError

Basically, if the first thing that the parser sees is a <br> then it throws an error. In the table cell example, a child parser gets created to parse the contents of the cell so it's exactly the same issue.
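
Until this is fixed in the library, one possible workaround is to strip any `<br>` that appears at the very start of the HTML or directly after an opening `<td>`/`<th>` before handing the string to the parser. The sketch below is an assumption-laden stopgap, not part of htmldocx: `strip_leading_br` is a hypothetical helper name, it relies on a regular expression rather than a real HTML parser, and it drops the leading line break instead of preserving it. It also only covers the two cases reported in this thread; other containers may need the same treatment.

```python
import re

# Matches one or more <br> tags that come either at the very start of the
# string or immediately after an opening <td>/<th> tag (attributes allowed).
# The prefix is captured so it can be kept in the result.
_LEADING_BR = re.compile(
    r'(^\s*|<t[dh]\b[^>]*>\s*)(?:<br\s*/?>\s*)+',
    re.IGNORECASE,
)


def strip_leading_br(html: str) -> str:
    """Remove <br> tags that are the first thing in the document or in a cell."""
    return _LEADING_BR.sub(r'\1', html)
```

The cleaned string would then be passed to `add_html_to_document` in place of the raw HTML. A `<br>` that follows other content, as in `<td>Hello world<br></td>`, is left untouched.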

dashingdove • Feb 07 '23 04:02

+1 on this issue

steps to replicate:

from docx import Document
from htmldocx import HtmlToDocx

document = Document()
new_parser = HtmlToDocx()

html = '<table><tr><td><br>testing</td></tr></table>'
new_parser.add_html_to_document(html, document)

ptkinvent • Mar 13 '24 18:03

+1

pierreavn • Jul 03 '24 15:07