tika-python
Checkboxes convert to FORMCHECKBOX
Checkboxes from Word documents convert to the text "FORMCHECKBOX" and lose any info about whether or not they are checked. Is it possible to render those differently and ideally maintain the "checked" status?
For instance, a row of checkboxes in a document is converted as follows (using the xmlContent=True flag):
<p><b>Current Permanency Plan</b>: FORMCHECKBOX
concurrent plan FORMCHECKBOX
reunification FORMCHECKBOX
adoption FORMCHECKBOX
emancipation/transition FORMCHECKBOX
guardianship </p>
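One possible workaround, sketched here outside of Tika itself, is to read the checkbox state straight from the document's XML and substitute it back into the converted text. This is only a sketch under several assumptions: the file is a .docx (not a binary .doc), the checkboxes are legacy form fields stored as w:checkBox elements in word/document.xml, and the FORMCHECKBOX tokens appear in the same order as those elements. The function names are hypothetical and not part of tika-python.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _on(elem):
    """Interpret an on/off element: a missing w:val attribute means true."""
    val = elem.get(W + "val")
    return val is None or val.lower() in ("1", "true", "on")


def checkbox_states_from_xml(document_xml):
    """Return one bool per legacy checkbox form field, in document order."""
    root = ET.fromstring(document_xml)
    states = []
    for box in root.iter(W + "checkBox"):
        checked = box.find(W + "checked")
        default = box.find(W + "default")
        if checked is not None:
            # An explicit w:checked takes precedence over w:default.
            states.append(_on(checked))
        elif default is not None:
            states.append(_on(default))
        else:
            states.append(False)
    return states


def checkbox_states(docx_path):
    """Assumption: docx_path points to a .docx file (a zip archive)."""
    with zipfile.ZipFile(docx_path) as zf:
        return checkbox_states_from_xml(zf.read("word/document.xml"))


def mark_checkboxes(text, states):
    """Replace each FORMCHECKBOX token with [x] or [ ], in order.

    Tokens beyond the end of `states` are rendered as unchecked.
    """
    it = iter(states)
    return re.sub(
        r"FORMCHECKBOX",
        lambda m: "[x]" if next(it, False) else "[ ]",
        text,
    )
```

With something like this, the content returned by the parser could be passed through mark_checkboxes(content, checkbox_states(path)). If the token order and the field order ever disagree, the marks would be attached to the wrong labels, so the output is worth spot-checking against the original document.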