.docx to TEI P5 XML Document conversion fails
Can you help me? Our other files are OK; only this one doesn't work. What's wrong? Kind regards, Henrike
Error occured. Please check the filetype and try again.?
Error: class pl.psnc.dl.ege.exception.ConverterException
Processing terminated by xsl:message at line 130 in fields.xsl
I did a little debugging and the error I get (from running on the command line) is
fldSimple: unrecognized type REF BMfig_wheel \* MERGEFORMAT
This originates from the word file here:
<w:fldSimple w:instr="REF BMfig_wheel \* MERGEFORMAT ">
<w:r w:rsidRPr="005B4B5A">
<w:rPr>
<w:rStyle w:val="AbbVerweiszfdgZchn"/>
</w:rPr>
<w:t>1</w:t>
</w:r>
</w:fldSimple>
-- which is the "1" reference in "The wheel (Figure 1) is constructed …"
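For context, `REF` is Word's cross-reference field: its first argument names a bookmark elsewhere in the document, so one thing worth checking is whether that bookmark is actually defined. Below is a minimal Python sketch of such a check, not part of the Stylesheets. The function names are made up for illustration, and it only looks at simple fields (`w:fldSimple`), not complex fields built from `w:fldChar` / `w:instrText`.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def dangling_ref_fields(document_xml):
    """Return bookmark names that simple REF fields point to but that
    are not defined by any w:bookmarkStart in the same document part."""
    root = ET.fromstring(document_xml)
    bookmarks = {b.get(f"{{{W}}}name") for b in root.iter(f"{{{W}}}bookmarkStart")}
    missing = []
    for field in root.iter(f"{{{W}}}fldSimple"):
        # e.g. 'REF BMfig_wheel \* MERGEFORMAT ' -> ['REF', 'BMfig_wheel', ...]
        instr = (field.get(f"{{{W}}}instr") or "").split()
        if len(instr) >= 2 and instr[0].upper() == "REF" and instr[1] not in bookmarks:
            missing.append(instr[1])
    return missing


def dangling_ref_fields_in_docx(path):
    """Convenience wrapper (hypothetical helper): read word/document.xml
    from a .docx archive and run the check on it."""
    with zipfile.ZipFile(path) as docx:
        return dangling_ref_fields(docx.read("word/document.xml"))


# Reduced sample based on the snippet above; it defines no bookmark at all.
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:fldSimple w:instr="REF BMfig_wheel \\* MERGEFORMAT ">'
    "<w:r><w:t>1</w:t></w:r>"
    "</w:fldSimple>"
    "</w:p></w:body></w:document>"
)

if __name__ == "__main__":
    print(dangling_ref_fields(SAMPLE))
```

Calling `dangling_ref_fields_in_docx("emotion_analysis_2019.docx")` on a local copy of the file would list any `REF` targets without a matching bookmark; an empty list means every simple `REF` field resolves.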
I'm no docx expert, so I do not know which (arcane) feature this is or how to treat it correctly. Hence, I'd like to close it here and move it to the Stylesheets issues, if anyone thinks we should follow up on this.
Running this online with docxtotei produces the (slightly) more helpful error message:
[xslt] fldSimple: unrecognized type REF BMfig_wheel * MERGEFORMAT
which appears to relate to the reference to a graphic in section 2.2:
"The wheel (Figure 1) is constructed in the fashion of a color wheel"
I don't have Word here, so I cannot be sure. However, if I delete that parenthesized reference, save the file as DOCX, and try the conversion again, everything works fine.
Maybe the problem is that the graphic file isn't included in the document?
On 11/12/2019 15:00, fricke-steyer wrote:
emotion_analysis_2019.docx https://github.com/TEIC/oxgarage/files/3863581/emotion_analysis_2019.docx
Rather than opening a new issue, I am posting here another Word file that causes the Stylesheets to fail. At first glance it looks easier to fix than the previous one. The error is:
A sequence of more than one item is not allowed as the first argument of fn:starts-with() ("VAROVALKE_1_brez ozadja copy", "VAROVALKE_2_brez ozadja copy") ; SystemID: file:/project/tei/convert/Stylesheets/docx/from/graphics.xsl; Line#: 83; Column#: 12
The Council F2F group looked at this, and the problem is actually a pointer to something that does not exist in the Word document itself. We (@martinascholger, @joeytakeda, and I) think that in fields.xsl, at line 129, we should not terminate the processing but instead output a <hi> element with an error flag in the @rend attribute and then apply-templates to provide some helpful content.
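To make the intended behaviour concrete, here is a small Python model of that fallback. It is only an illustration of the idea, not the XSLT change itself: the `@rend` value and the function name are hypothetical, and joining the `w:t` text stands in for what `apply-templates` would produce.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical flag value; the proposal does not fix the wording of @rend.
ERROR_REND = "error-unrecognized-field"


def field_to_hi(field):
    """Model of the proposed fallback for an unrecognized field:
    emit <hi rend="..."> around the field's visible text instead of aborting."""
    hi = ET.Element("hi", {"rend": ERROR_REND})
    # Stand-in for xsl:apply-templates: keep the text Word displayed for the field.
    hi.text = "".join(t.text or "" for t in field.iter(f"{{{W}}}t"))
    return hi


# The field from the failing document, reduced to its essentials.
SAMPLE_FIELD = ET.fromstring(
    f'<w:fldSimple xmlns:w="{W}" w:instr="REF BMfig_wheel \\* MERGEFORMAT ">'
    "<w:r><w:t>1</w:t></w:r>"
    "</w:fldSimple>"
)

if __name__ == "__main__":
    print(ET.tostring(field_to_hi(SAMPLE_FIELD), encoding="unicode"))
```

With this approach the converted text would still read "The wheel (Figure 1) is constructed …", with the "1" flagged for later review rather than stopping the whole conversion.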
Re this issue, @TomazErjavec:
Rather than opening a new issue, I am posting here another Word file that causes the Stylesheets to fail. At first glance it looks easier to fix than the previous one. The error is:
A sequence of more than one item is not allowed as the first argument of fn:starts-with() ("VAROVALKE_1_brez ozadja copy", "VAROVALKE_2_brez ozadja copy") ; SystemID: file:/project/tei/convert/Stylesheets/docx/from/graphics.xsl; Line#: 83; Column#: 12
If this is still an issue, could you please open a new issue for the error?
@TomazErjavec as with the other issue, one workaround would be to use TEI Publisher's conversion. I am attaching the results:
TEI_Stylesheet_crash-test.docx.xml
@fricke-steyer for the document you posted, TEI Publisher's conversion is also successful.