docx4j

AlternateContent within RPr: "[ERROR] : unexpected element (uri:"http://schemas.openxmlformats.org/markup-compatibility/2006", local:"AlternateContent")"

Open jhrtl opened this issue 3 years ago • 2 comments

Observation

While building a unit test for my application, using a simple hello-world document, I received this warning and error when loading the document and its main part.

WARN  o.d.j.JaxbValidationEventHandler - [ERROR] : unexpected element (uri:"http://schemas.openxmlformats.org/markup-compatibility/2006", local:"AlternateContent"). Expect
WARN  o.d.j.JaxbValidationEventHandler - Column is 2496 at line number 2
INFO  o.d.j.JaxbValidationEventHandler - shouldContinue is set to false
WARN  o.d.o.p.JaxbXmlPartXPathAware - unexpected element (uri:"http://schemas.openxmlformats.org/markup-compatibility/2006", local:"AlternateContent"). Expected elements are <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}webHidden>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}szCs>,<{http://schemas.microsoft.com/office/word/2010/wordml}textOutline>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}shadow>,<{http://schemas.microsoft.com/office/word/2010/wordml}reflection>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}specVanish>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}rStyle>,<{http://schemas.microsoft.com/office/word/2010/wordml}numSpacing>,<{http://schemas.microsoft.com/office/word/2010/wordml}stylisticSets>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}emboss>,<{http://schemas.microsoft.com/office/word/2010/wordml}glow>,<{http://schemas.microsoft.com/office/word/2010/wordml}ligatures>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}vanish>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}rFonts>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}kern>,<{http://schemas.microsoft.com/office/word/2010/wordml}shadow>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}em>,<{http://schemas.microsoft.com/office/word/2010/wordml}scene3d>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}shd>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}fitText>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}effect>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}position>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}iCs>,<{http://schemas.microsoft.com/office/word/2010/wordml}props3d>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}smallCaps>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}imprint>,<{http://schemas.openxmlforma
ts.org/wordprocessingml/2006/main}color>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}caps>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}snapToGrid>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}spacing>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}outline>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}dstrike>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}highlight>,<{http://schemas.microsoft.com/office/word/2010/wordml}cntxtAlts>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}oMath>,<{http://schemas.microsoft.com/office/word/2010/wordml}textFill>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}lang>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}b>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}sz>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}strike>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}eastAsianLayout>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}rtl>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}i>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}bCs>,<{http://schemas.microsoft.com/office/word/2010/wordml}numForm>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}bdr>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}cs>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}noProof>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}w>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}u>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}vertAlign>,<{http://schemas.openxmlformats.org/wordprocessingml/2006/main}rPrChange>

Analysis

The reported position (line 2, column 2496) points to an AlternateContent element inside an rPr. Simplified overview:

<w:body>
	<w:p w:rsidR="00FC58C4" w:rsidRPr="00D601A0" w:rsidRDefault="00062ADE">
		<w:r w:rsidR="00D601A0" w:rsidRPr="00D601A0">
			<w:rPr>
				<mc:AlternateContent>
					<mc:Choice Requires="w16se"/>
					<mc:Fallback>
						<w:rFonts w:ascii="Segoe UI Emoji" w:eastAsia="Segoe UI Emoji" w:hAnsi="Segoe UI Emoji" w:cs="Segoe UI Emoji"/>
					</mc:Fallback>
				</mc:AlternateContent>
				<w:color w:val="C00000"/>
			</w:rPr>
		</w:r>
	</w:p>
</w:body>

If this AlternateContent is removed from the XML, the error no longer occurs.

Theories

This problem seems to be specific to Word 2019 or newer patch levels. An older version of Office 2016 would not encode my document with such an AlternateContent.

I guess the technical reason for docx4j failing is that neither org.docx4j.wml.RPr nor its parent classes define AlternateContent as a possible child element.

Example and reproduction

I attached a minimal Maven project to reproduce the issue (it also contains my docx file): example.zip

jhrtl, Apr 16 '21 17:04
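
To find out whether a given document part contains the structure described in the Analysis section, one option is to count AlternateContent elements that are direct children of an rPr. The following is a minimal sketch using only the JDK's DOM and XPath APIs; it is not part of docx4j, and the class and method names (`AlternateContentCheck`, `countInRunProperties`) are hypothetical. It assumes the part's XML is already available as a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not a docx4j class.
class AlternateContentCheck {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String MC_NS =
        "http://schemas.openxmlformats.org/markup-compatibility/2006";

    // Counts mc:AlternateContent elements that are direct children of a w:rPr.
    static int countInRunProperties(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so namespace-uri() is populated
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // Match on local name and namespace URI, so no prefix mapping is needed.
            String expr =
                "count(//*[local-name()='rPr' and namespace-uri()='" + W_NS + "']"
              + "/*[local-name()='AlternateContent' and namespace-uri()='" + MC_NS + "'])";
            Double n = (Double) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("Could not inspect XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:body xmlns:w='" + W_NS + "' xmlns:mc='" + MC_NS + "'>"
            + "<w:p><w:r><w:rPr>"
            + "<mc:AlternateContent><mc:Choice Requires='w16se'/>"
            + "<mc:Fallback><w:rFonts w:ascii='Segoe UI Emoji'/></mc:Fallback>"
            + "</mc:AlternateContent>"
            + "<w:color w:val='C00000'/>"
            + "</w:rPr></w:r></w:p></w:body>";
        System.out.println(countInRunProperties(xml));
    }
}
```

A non-zero count suggests the part would trigger the warning shown above when loaded.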

Thanks for this. As things stand, docx4j uses XSLT to convert the w:rPr to:

			<w:rPr>
				<w:rFonts w:ascii="Segoe UI Emoji" w:eastAsia="Segoe UI Emoji" w:hAnsi="Segoe UI Emoji" w:cs="Segoe UI Emoji"/>
				<w:color w:val="C00000"/>
			</w:rPr>

and processes the resulting content happily enough, right?

If that is right, that makes this a medium priority issue which we'll fix in due course by updating the content model for rPr. (Historically, until v3.3.8, docx4j handled all AlternateContent by selecting the Fallback.)

plutext, Apr 21 '21 06:04
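
The "select the Fallback" conversion described in the comment above can be illustrated with a small identity-style XSLT. This is a sketch only: it is not docx4j's actual stylesheet, and the class name `FallbackSelector` is hypothetical. It uses the JDK's built-in XSLT 1.0 support and replaces every mc:AlternateContent with the children of its mc:Fallback.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative sketch; NOT the stylesheet or API that docx4j uses.
class FallbackSelector {

    // Identity transform plus one rule: an mc:AlternateContent element is
    // replaced by the children of its mc:Fallback (mc:Choice is dropped).
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:mc='http://schemas.openxmlformats.org/markup-compatibility/2006'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='mc:AlternateContent'>"
      + "<xsl:apply-templates select='mc:Fallback/node()'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String selectFallback(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String rPr =
            "<w:rPr xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'"
          + " xmlns:mc='http://schemas.openxmlformats.org/markup-compatibility/2006'>"
          + "<mc:AlternateContent><mc:Choice Requires='w16se'/>"
          + "<mc:Fallback><w:rFonts w:ascii='Segoe UI Emoji'/></mc:Fallback>"
          + "</mc:AlternateContent>"
          + "<w:color w:val='C00000'/>"
          + "</w:rPr>";
        // The printed rPr contains w:rFonts and w:color, with no AlternateContent.
        System.out.println(selectFallback(rPr));
    }
}
```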

Thanks for responding and commenting on it :)

As far as I can see: yes, it ends up being transformed via XSLT and works fine afterwards.

Initially I found this because it crashed for me, but that was because my JAXBContext, which caches an Unwrapper per thread to speed things up, did not cleanly implement the binder, which docx4j uses for this fallback.

Using the binder is a performance nightmare that I try to avoid at all costs. In my examples, using the transformation fallback takes 8 to 30 times as long. But so far I haven't seen this problem occur often (in fact, it occurred for the first time after hundreds of tests). As rare as it seems, the extra time might just get lost in the noise of random extra computing time elsewhere.

So, for now, medium priority also seems reasonable to me.

Should I find contrary examples, I will let you know.

jhrtl, Apr 21 '21 08:04
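
The per-thread caching mentioned in the last comment is not shown in the thread. As a rough illustration of the general idea, here is a minimal sketch built on `ThreadLocal`. The class `PerThreadCache` is hypothetical, and a `StringBuilder` stands in for the cached object (in the poster's case presumably a JAXB unmarshalling object, which is an assumption here).

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of docx4j or JAXB.
final class PerThreadCache<T> {
    private final ThreadLocal<T> slot;

    PerThreadCache(Supplier<T> factory) {
        // The factory runs once per thread, on that thread's first get().
        this.slot = ThreadLocal.withInitial(factory);
    }

    // Returns the instance owned by the calling thread.
    T get() {
        return slot.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // StringBuilder is only a stand-in for an expensive, non-thread-safe object.
        PerThreadCache<StringBuilder> cache = new PerThreadCache<>(StringBuilder::new);
        StringBuilder mine = cache.get();

        StringBuilder[] other = new StringBuilder[1];
        Thread worker = new Thread(() -> other[0] = cache.get());
        worker.start();
        worker.join();

        System.out.println("same thread reuses instance: " + (mine == cache.get()));
        System.out.println("other thread gets its own:   " + (mine != other[0]));
    }
}
```

A cache like this only covers the object it wraps. Anything created on demand from the same context, such as the binder used by the fallback path, has to be handled separately.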