Open-XML-SDK
Schema Error when order of w:b and w:i is switched inside w:rPr
Is this a:
- [x] Issue with the OpenXml library
- [ ] Question on library usage
Description
When validating a docx file that has <w:rPr> with <w:i/> and <w:b/>, an error is thrown if <w:i/> appears before <w:b/>.
Information
- .NET Target: .NET Core 3.1
- DocumentFormat.OpenXml Version: 2.11.0
Repro
<w:r w:rsidRPr="0009578B">
  <w:rPr>
    <w:i/>
    <w:b/>
    <w:lang w:val="en-US"/>
  </w:rPr>
  <w:t>Test</w:t>
</w:r>
Observed
The following demo.docx file fails validation, although it still opens in Word and shows correctly formatted text.
Expected
The file should pass validation.
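As a possible workaround until this is clarified, the children of w:rPr can be put into the expected order before the file is validated. The sketch below is Python with only the standard library, not SDK code. The `KNOWN_ORDER` list and the `reorder_rpr` helper are hypothetical names, and the list contains only the leading element names quoted from the schema later in this thread. Anything not in that list is simply kept after the known elements in its original relative order, which is an assumption and may not match the full schema.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)

# Assumption: only the first few names of the run-property order, as quoted
# from the schema in this thread. The real list is longer.
KNOWN_ORDER = ["rStyle", "rFonts", "b", "bCs", "i"]


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def reorder_rpr(rpr):
    """Sort the children of a w:rPr element into KNOWN_ORDER, in place."""
    rank = {name: pos for pos, name in enumerate(KNOWN_ORDER)}
    children = list(rpr)
    # list.sort() is stable, so elements with unknown names keep their
    # original relative order and end up after the known ones.
    children.sort(key=lambda child: rank.get(local_name(child.tag), len(KNOWN_ORDER)))
    rpr[:] = children


run = ET.fromstring(
    f'<w:r xmlns:w="{W_NS}">'
    '<w:rPr><w:i/><w:b/><w:lang w:val="en-US"/></w:rPr>'
    "<w:t>Test</w:t>"
    "</w:r>"
)
rpr = run.find(f"{{{W_NS}}}rPr")
reorder_rpr(rpr)
print([local_name(child.tag) for child in rpr])  # ['b', 'i', 'lang']
```

With the repro fragment, w:b moves in front of w:i and w:lang stays last.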
I can confirm this. The same happens with <font> in spreadsheets.
In fact, the error is thrown in every case except one: when the child elements appear in the same order as in the generated code here: https://raw.githubusercontent.com/OfficeDev/Open-XML-SDK/master/src/DocumentFormat.OpenXml/GeneratedCode/schemas_openxmlformats_org_spreadsheetml_2006_main.g.cs , lines 17028 (for rPr) and 28814 (for font).
The same code appears there in both cases:
builder.AddChild<Bold>();
builder.AddChild<Italic>();
builder.AddChild<Strike>();
builder.AddChild<Condense>();
builder.AddChild<Extend>();
builder.AddChild<Outline>();
builder.AddChild<Shadow>();
builder.AddChild<Underline>();
builder.AddChild<VerticalTextAlignment>();
builder.AddChild<FontSize>();
builder.AddChild<Color>();
builder.AddChild<RunFont>();
builder.AddChild<FontFamily>();
builder.AddChild<RunPropertyCharSet>();
builder.AddChild<FontScheme>();
and with any other order of child elements the validator throws errors.
For example, say I have this in xl/styles.xml:
<font>
  <name val="Arial"/>
  <charset val="1"/>
  <family val="2"/>
  <sz val="10"/>
</font>
and it throws:
The element has unexpected child element 'http://schemas.openxmlformats.org/spreadsheetml/2006/main:family'.
but when I change it to
<font>
  <sz val="10"/>
  <name val="Arial"/>
  <family val="2"/>
  <charset val="1"/>
</font>
the error is gone.
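To make the behavior easier to reason about, here is a rough Python sketch of a strict-order check that reproduces both results above. It is not the SDK's validator. The XML element names in `FONT_CHILD_ORDER` are my assumed mapping of the class names in the generated code (for example `RunFont` to `name` and `FontSize` to `sz`), the function name is hypothetical, and the check assumes each child may appear at most once.

```python
# Assumption: XML local names corresponding to the AddChild<...>() calls
# listed above, in the same order.
FONT_CHILD_ORDER = [
    "b", "i", "strike", "condense", "extend", "outline", "shadow", "u",
    "vertAlign", "sz", "color", "name", "family", "charset", "scheme",
]


def first_unexpected_child(children, order=FONT_CHILD_ORDER):
    """Return the first child name that breaks the expected order, or None."""
    last_pos = -1
    for name in children:
        if name not in order:
            return name
        pos = order.index(name)
        # Each child must come strictly later in the list than the previous one.
        if pos <= last_pos:
            return name
        last_pos = pos
    return None


failing = ["name", "charset", "family", "sz"]
passing = ["sz", "name", "family", "charset"]

print(first_unexpected_child(failing))  # family
print(first_unexpected_child(passing))  # None
```

The first list stops at `family`, the same element named in the error message, because `family` comes before `charset` in the expected order.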
@filipkis @zgordan-vv this is interesting. The schemas that we use for the generation of the framework look like this:
<xsd:group name="EG_RPrBase" ...
**<xsd:sequence>**
<xsd:element name="rStyle"
<xsd:element name="rFonts"
<xsd:element name="b" ...
whereas the ISO 29500 defined schemas from Part 1 Annex A look like this:
<xsd:group name="EG_RPrBase">
**<xsd:choice>**
<xsd:element name="rStyle" type="CT_String"/>
<xsd:element name="rFonts" type="CT_Fonts"/>
<xsd:element name="b" type="CT_OnOff"/>
<xsd:element name="bCs" type="CT_OnOff"/>
<xsd:element name="i" type="CT_OnOff"/>
...
This would explain why Office and the SDK framework treat the rPr elements as respecting an order. The OpenXML SDK framework was originally built (and still is) to describe the behavior of Office, not just the standard. In an initial search of [MS-OI29500] (our implementation notes), I don't see any notes on this deviation, although there may be something. Since our validation is based on the Office schemas, not on the ISO schemas, I think we wouldn't change this, but we could consider an enhancement or feature to add ISO validation.
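The practical difference between the two content models can be shown with a small Python sketch. This is a simplification, not either schema's real validation: it uses only the five element names quoted above, treats every member of the sequence as optional and non-repeating, and treats the choice as repeatable, all of which are assumptions on my part. The function names are hypothetical.

```python
# Only the element names quoted from the schemas above; the real group is longer.
RPR_BASE = ["rStyle", "rFonts", "b", "bCs", "i"]


def valid_as_sequence(children, members=RPR_BASE):
    """Sequence model: members are optional but must follow the declared order."""
    if any(child not in members for child in children):
        return False
    positions = [members.index(child) for child in children]
    # Strictly increasing positions means declared order with no repeats.
    return all(a < b for a, b in zip(positions, positions[1:]))


def valid_as_choice(children, members=RPR_BASE):
    """Repeating choice model: any member, in any order."""
    return all(child in members for child in children)


for children in (["b", "i"], ["i", "b"]):
    print(children, valid_as_sequence(children), valid_as_choice(children))
# ['b', 'i'] True True
# ['i', 'b'] False True
```

Under these assumptions, the order from the original report (`i` before `b`) is accepted by the choice model and rejected by the sequence model, which matches what the validator reports.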
It would be good to get a clarification on this. I think everyone wants to produce files that are valid in the MS Office suite, but this kind of discrepancy makes it challenging.
ISO validation would possibly silence such problems, but the resulting files might not work in Office if the information is not compatible. In this case choice and sequence can produce quite different information. Perhaps the Office schema is an implementation decision, since the xsd:choice has fewer options than it would have as a sequence?
This is essentially a request to implement the SDK validation and classes based on the ISO standard rather than on Office. There may be some need to add an implementation note in MS-OI29500 (I'll follow up on that separately), but an ISO implementation is a potential future project that we have added to the projects board: https://github.com/dotnet/Open-XML-SDK/projects/3
Closing.