Open-XML-SDK icon indicating copy to clipboard operation
Open-XML-SDK copied to clipboard

Add Check for empty parts to OpenXmlValidator

Open mikeebowen opened this issue 7 months ago • 4 comments

  • closes #816

The issue was not that there was no way to check for a minimally valid document, the issue was that the validator considered completely empty parts to be valid. So this issue was resolved by adding a check for empty parts to the OpenXmlValidator.

mikeebowen avatar Apr 23 '25 18:04 mikeebowen

Test Results

    55 files   -  3      55 suites   - 3   54m 56s ⏱️ +22s  2 061 tests + 1   2 058 ✅ + 1   3 💤 ±0  0 ❌ ±0  32 301 runs   - 24  32 265 ✅  - 24  36 💤 ±0  0 ❌ ±0 

Results for commit 25e5c67b. ± Comparison against base commit a073d057.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Apr 23 '25 19:04 github-actions[bot]

@twsouthwick

Be overrideable/replaceable

Created virtual method in OpenXmlPackage

Be accessed off of OpenXmlPackage directly rather than implemented separately

There is a default implementation in OpenXmlPackage but the individual Wordprocessing, Spreadsheet, and Presentation Documents have unique implementations for their different requirements.

Should this be a separate API? Or should it be opt in while opening a package? I tend not to be a fan of APIs that check if something is valid then doing it - there's always the potential something gets changed underneath

Added VerifyMinimumPackage option to OpenSettings when true, the package contents are examined and throw an exception if minimum package requirements are not met

Do we really want to validate the extension? What about if it's a stream/package? If it is overrideable, we could have a check if we've opened it with a path to validate that if needed

The minimum package is verified on all Open overloads with VerifyMinimumPackage true. The Open method handles exceptions for the file path such as invalid extension or invalid characters, so with the new API validating the path is not necessary.

Are we OK with it throwing an exception if it is an ill-formed package (i.e. we already do that)?

Removed file checks. The Open method already handles those.

mikeebowen avatar May 06 '25 16:05 mikeebowen

@mikeebowen this would be a good use case for a feature so it can be overridden at runtime vs compile time. Here's an example of how to do it: https://github.com/dotnet/Open-XML-SDK/commit/fa893dd6686124d2490ba92f7178b64308448698

twsouthwick avatar May 12 '25 17:05 twsouthwick

A few questions about the direction/goal here

Only a few kinds of packages are supported (such as in the presentation doc) - it will fail if it's an unsupported one

Is this what we want? I'd expect it to validate it but fail if it is invalid. It appears to also fail if we don't have a check which seems a little unexpected.

Validate() throws and returns true/false

I'd expect it to do one or the other. Is the manual throwing due to custom messages? If so, let's make it consistent

Mixture of exception types

I think we should use an existing OpenXmlPackageException type or create a new one for this

How to think about this vs validation

At some point, how is this different from the existing OpenXmlValidator logic? Should we be hooking this up to that and performing a full validation?

twsouthwick avatar May 13 '25 00:05 twsouthwick