Open-XML-SDK Strongly typed classes define all properties as nullable

Description

Using the generated Sheet class as an example, it seems that (pretty much all) the strongly typed classes define all their properties as nullable. For example:

public StringValue? Name
{
    get => GetAttribute<StringValue>();
    set => SetAttribute(value);
}

This is done even where the schema defines the corresponding attributes as required, meaning they will (or should) not be null. For example:

<xsd:complexType name="CT_Sheets">
  <xsd:sequence>
    <xsd:element name="sheet" type="CT_Sheet" minOccurs="1" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_Sheet">
  <xsd:attribute name="name" type="s:ST_Xstring" use="required"/>
  <xsd:attribute name="sheetId" type="xsd:unsignedInt" use="required"/>
  <xsd:attribute name="state" type="ST_SheetState" use="optional" default="visible"/>
  <xsd:attribute ref="r:id" use="required"/>
</xsd:complexType>

Flagging all properties as nullable is at least not helpful. It would be better to have Name, SheetId, and Id not nullable and only State nullable, meaning the schema would be reflected in the nullability of the properties. Otherwise, you'll have to add countless ! to tell the compiler that everything is fine.

Information

.NET Target: all
DocumentFormat.OpenXml Version: latest

Observed

Nullability is not in line with schema.

Expected

Nullability should be in line with schema.

Jul 09 '21 17:07 ThomasBarnekow

Ooh that's interesting. The challenge is that from an API perspective, they may be null since we're creating from scratch and it's not available. It would be great to swap things out for records and init properties, but that would be a big change.

What about DisallowNullAttribute on each of the required ones? That way it still provides the benefit that you should check if it's null, but it doesn't allow you to set it to null.

Jul 23 '21 00:07 twsouthwick

Just as a thought: is it a good idea to enable nullability if you can't make the right assertions?

Maybe I interpreted this nullability thing wrong, but I would not (want to) perform null checks when nullability is enabled and a value can't be null. If I had to perform null checks regardless, I would not see much, if any, value in enabling nullability.

Jul 23 '21 09:07 ThomasBarnekow

I think you're viewing it as a potential semantic constraint of what the structure of the document requires/has as optional. Nullability is definitely not a cure all for all those semantic constraints. Here it is helpful as you are building the document and you may not have actually set a value yet. Yes, you should have certain required ones (the DisallowNullAttribute prevents you from settings something to null), but we can't assert by the type system that they are actually there.

The main benefit of nullability annotations are to tell you where you need to check for null, and once you've done it, the compiler tracks it for further usage areas to ensure you've checked on all applicable paths. I envision a few iterations on the SDK to get to a place where the nullability attributes are as useful as they could be.

Jul 23 '21 19:07 twsouthwick

The main benefit of nullability annotations is to tell you where you need to check for null, and once you've done it, the compiler tracks it for further usage areas to ensure you've checked on all applicable paths.

I should not have to check for null where the schema does not allow null values. When loading the DOM tree from a valid document, any required values (e.g., the name of a sheet as shown by my initial example above) will not be null. Since the API flags the name as nullable, however, I will have to check for null regardless. The reason is that the API does not currently ensure the validity of the Open XML markup, e.g., by adding required attributes as constructor arguments. As long as that is not done, those nullability flags will lead to many "false positives".

Jul 28 '21 22:07 ThomasBarnekow

That would be nice but, pretty sure the reason for this is backwards compatibility. If they make them non-nullable, then there would be problems compiling older code with nullability enabled. Also, even though the specification says they are required, does that really mean there is not any circumstance under which they could be null?

Aug 20 '21 22:08 SoftCircuits

They could definitely be null, hence why they are marked that way. For instance, if we open a malformed document it may be null because the value isn't there. Nullability is not a replacement for semantic validation, but can sometimes overlap.

Happy to explore and proposals of how the constraints of ensuring backwards compatibility and making the nullability annotations more helpful can be balanced. For example, we could potentially

Add a method to ensure required properties are set (this could then flow information to the compiler via the MemberNotNullWhen attribute)
Add DissallowNull to the required ones so they can't be removed

Aug 30 '21 20:08 twsouthwick

Open-XML-SDK Open-XML-SDK copied to clipboard

Strongly typed classes define all properties as nullable

Open-XML-SDK
Open-XML-SDK copied to clipboard