Open-XML-SDK
Open-XML-SDK copied to clipboard
DocumentType Strict does not exist
Description
DocumentType Strict does not exist if you want to change document type. See available options for changing document types for spreadsheets:
https://docs.microsoft.com/en-us/dotnet/api/documentformat.openxml.spreadsheetdocumenttype?view=openxml-2.8.1
Information
- DocumentFormat.OpenXml Version: (ie 2.7.2)
Repro
// Please add a self-contained, minimum viable repro of the issue.
// If you require external resources, please provide a gist or GitHub repro
// An Xunit style test is preferred, but a console application would work too.
Observed
No Strict DocumentType exist for changing DocumentType.
Expected
You can save files as Strict in the file type dropdown menu in Office, but you cannot select Strict as a DocumentType in the SDK.
Hi @Asbjoedt ,
What are you trying to do? Are you trying to create documents with the SDK and set them as strict mode? If you are trying to create documents with the SDK, could you post the code you are trying?
Hi @mikeebowen
Are you trying to create documents with the SDK and set them as strict mode?
Yes! I asked the same question on Stackoverflow 3 months ago, but I have received no replies: https://stackoverflow.com/questions/71940270/how-to-programmatically-create-ooxml-strict-documents
If you are trying to create documents with the SDK, could you post the code you are trying? I have tried two codes.
Code 1. Convert to DocumentType Workbook and do nothing else. "ChangeDocumentType" -> "Workbook" defaults to Transitional conformance, which is fine. But how to make it Strict?
using using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
byte[] byteArray = File.ReadAllBytes(input_filepath);
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
using (SpreadsheetDocument spreadsheetDoc = SpreadsheetDocument.Open(stream, true))
{
spreadsheetDoc.ChangeDocumentType(SpreadsheetDocumentType.Workbook);
}
File.WriteAllBytes(output_filepath, stream.ToArray());
}
Code 2. Convert using SaveAs(). This (makes sense) also defaults to Transitional (however when you input a Strict file to convert it to Transitional, the new file will create an OOXML validator error related to namespaces (I will test this bug again and create a new issue for it, if it is repro))
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using (SpreadsheetDocument spreadsheetDoc = SpreadsheetDocument.Open(input_filepath, true))
{
SpreadsheetDoc.SaveAs(output_filepath)
}
What do you do to transform a Transitional document preferably a spreadsheet from Transitional to Strict conformance? In Office it is as simple as selecting Strict file format. In SDK, this option does not exist, and any conformance transform would have to be coded manually. In Office Strict conformance can only be selected for non-macro and non-template file formats meaning (docx, xlsx, pptx) and not e.g. xltx, xlsm etc. These should have separate "ChangeDocumentType" options also, or another way of specifying conformance when using "ChangeDocumentType".
Does this make sense?
It is a big issue for us as an archive, that we do not have a programmatic way to transform Transtional to Strict conformance (I am not a programmer, but I have not found any libraries or programs that can help us do this).
Thanks for listening.
Hi @Asbjoedt ,
Unfortunately, the StrictRelationshipFound property is read only, so it can be used to create a report, but it can't edit the attribute to make it strict. I'm not sure if it's possible another way, but I will look into it.
Hi again
Regarding spreadsheets.
Changing conformance class can be done with:
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using (var spreadsheet = SpreadsheetDocument.Open(input_filepath, true))
{
Workbook workbook = spreadsheet.WorkbookPart.Workbook;
if (workbook.Conformance == null || workbook.Conformance != "strict")
{
workbook.Conformance.Value = ConformanceClass.Enumstrict;
}
}
Changing from Transitional to Strict involves several other things. Importantly, it involves changing many namespaces. Here's a non-exhaustive (incomplete) list of namespaces for Strict, which for every single one must be changed in the spreadsheet.
http://purl.oclc.org/ooxml/spreadsheetml/main
http://purl.oclc.org/ooxml/officeDocument/relationships
http://purl.oclc.org/ooxml/officeDocument/extendedProperties
http://purl.oclc.org/ooxml/officeDocument/docPropsVTypes
http://purl.oclc.org/ooxml/officeDocument/relationships/styles
http://purl.oclc.org/ooxml/officeDocument/relationships/theme
http://purl.oclc.org/ooxml/officeDocument/relationships/worksheet
http://purl.oclc.org/ooxml/officeDocument/relationships/sharedStrings
http://purl.oclc.org/ooxml/officeDocument/relationships/externalLink
http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument
http://purl.oclc.org/ooxml/officeDocument/relationships/externalLinkPath
http://purl.oclc.org/ooxml/officeDocument/relationships/oleObject
http://purl.oclc.org/ooxml/drawingml/main
http://purl.oclc.org/ooxml/drawingml/spreadsheetDrawing
urn:schemas-microsoft-com:vml
I am trying to write my own code to amend all these namespaces but it is a manual and time-consuming process.
Changing from Transitional to Strict also means not accepting the old drawing VML but instead accepting DrawingML and not accepting certain Japanese characters.
Basically my company needs an easy way to change from Transitional to Strict. It can be done with Excel Interop dependency using this code, but we want to use a self-dependent application only using Open XML SDK:
Excel.Application app = new Excel.Application(); // Create Excel object instance
Excel.Workbook wb = app.Workbooks.Open(input_filepath); // Create workbook instance
wb.SaveAs(output_filepath, 61); // Save workbook as .xlsx Strict
wb.Close(); // Close workbook
app.Quit(); // Quit Excel application
If Open XML SDK could support an easy way to do all of the above-mentioned changes all at once with one or a few lines of code, this would meet an important business requirement for long-term archiving/preservation of spreadsheets.
There doesn't seem to be anything specifically available to do this OOB. However, I bet you could customize the IOpenXmlNamespaceResolver to do this for you. This is an internal type, so you'll need to clone the repo to try it out.
Some pseudo code that will most definitely not compile and should only set you on the right track, would be:
using var doc = SpreadsheetDocument.Open(...);
doc.ForceStrictNamespaces();
doc.Save(...)
public static void ForceStrictNamespaces(this OpenXmlPackage package)
{
var existing = doc.Features.Get<IOpenXmlnamespaceResolver>());
doc.Features.Set<IOpenXmlNamespaceResolver>(new ForceStrictNamespace(existing));
}
class ForceStrictNamespace : IOpenXmlNamespaceResolver
{
public ForceStrictNamespace(IOpenXmlNamespaceResolver other)
{
...
}
... // Implement it to give back that strict namespaces
}
Happy to take a PR that enables this :) Please let us know any design proposals before pushing a PR
Hi @twsouthwick
Thanks for your reply and the guidance in how to move forward. I cloned the repo and looked at this code briefly. I don't think I can deliver any code to you in a level acceptable, because I have only learned to program recently and these concepts with Interfaces is still difficult for me to write.
I can propose the following code (of which some is pseudo-code because I need you to point me in the right direction).
I propose to make a list of items each with prefix, transitional namespace and strict namespace. This is a kind of dictionary. I show the code below.
public class namespaceIndex
{
public string Prefix { get; set; }
public string Transitional { get; set; }
public string Strict { get; set; }
public static List<namespaceIndex> Create_Namespaces_Index()
{
List<namespaceIndex> list = new List<namespaceIndex>();
// xmlns
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/spreadsheetml/2006/main", Strict = "http://purl.oclc.org/ooxml/spreadsheetml/main" });
// docProps/
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties", Strict = "http://purl.oclc.org/ooxml/officeDocument/extendedProperties" });
// docProps/vt
list.Add(new namespaceIndex() { Prefix = "vt", Transitional = "http://purl.oclc.org/ooxml/officeDocument/docPropsVTypes", Strict = "http://purl.oclc.org/ooxml/officeDocument/docPropsVTypes" });
// relationships/r
list.Add(new namespaceIndex() { Prefix = "r", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships" });
// relationship/styles
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/styles" });
// relationship/theme
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/theme" });
// relationship/worksheet
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/worksheet" });
// relationship/sharedstrings
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/sharedStrings" });
// relationship/externalLink
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/externalLink", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/externalLink" });
// relationship/officeDocument
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument" });
// relationship/externallink/externalLinkPath
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/externalLinkPath", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/externalLinkPath" });
// relationship/oleObject
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/oleObject" });
// relationship/image
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/image" });
// relationship/video
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/video", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/video" });
// relationship/pivotCacheDefininition
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/pivotCacheDefinition", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/pivotCacheDefinition" });
// relationship/pivotCache Records
list.Add(new namespaceIndex() { Prefix = "", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/pivotCacheRecords", Strict = "http://purl.oclc.org/ooxml/officeDocument/relationships/pivotCacheRecords" });
// drawingml/a
list.Add(new namespaceIndex() { Prefix = "a", Transitional = "http://schemas.openxmlformats.org/drawingml/2006/main", Strict = "http://purl.oclc.org/ooxml/drawingml/main" });
// drawingml/xdr
list.Add(new namespaceIndex() { Prefix = "xdr", Transitional = "http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing", Strict = "http://purl.oclc.org/ooxml/drawingml/spreadsheetDrawing" });
// drawingml/chart
list.Add(new namespaceIndex() { Prefix = "c", Transitional = "http://schemas.openxmlformats.org/drawingml/2006/chart", Strict = "http://purl.oclc.org/ooxml/drawingml/chart" });
// customXml/ds
list.Add(new namespaceIndex() { Prefix = "ds", Transitional = "http://schemas.openxmlformats.org/officeDocument/2006/customXml", Strict = "" });
// urn for Strict - NO NAMESPACE FOR TRANSITIONAL
list.Add(new namespaceIndex() { Prefix = "v", Transitional = "", Strict = "urn:schemas-microsoft-com:vml" });
// docProps/core.xml - NO NAMESPACE FOR TRANSITIONAL
list.Add(new namespaceIndex() { Prefix = "dc", Transitional = "", Strict = "http://purl.org/dc/elements/1.1/" });
// docProps/core.xml - NO NAMESPACE FOR TRANSITIONAL
list.Add(new namespaceIndex() { Prefix = "dcterms", Transitional = "", Strict = "http://purl.org/dc/terms/" });
// docProps/core.xml - NO NAMESPACE FOR TRANSITIONAL
list.Add(new namespaceIndex() { Prefix = "dcmitype", Transitional = "", Strict = "http://purl.org/dc/dcmitype/" });
return list;
}
}
I then propose to make a method that can open a spreadsheet, register all existing namespaces in a dictionary and then search in our above-mentioned list of namespaces for any namespace to convert to.
I show the code below. However, two components are pseudo code.
public void Convert_Transitional_to_Strict(string input_filepath)
{
// Create list of namespaces
List<namespaceIndex> namespaces = namespaceIndex.Create_Namespaces_Index();
using (var spreadsheet = SpreadsheetDocument.Open(input_filepath, true))
{
WorkbookPart wbPart = spreadsheet.WorkbookPart;
DocumentFormat.OpenXml.Spreadsheet.Workbook workbook = wbPart.Workbook;
// If Transitional
if (workbook.Conformance == null || workbook.Conformance != "strict")
{
// Create dictionary of existing namespaces
Dictionary<string, string> existingNamespaces = spreadsheet
.GetAllParts()
.NamespaceDeclarations
.ToDictionary();
// Change conformance class
workbook.Conformance.Value = ConformanceClass.Enumstrict;
// Add vml urn namespace to workbook.xml
workbook.AddNamespaceDeclaration("v", "urn:schemas-microsoft-com:vml");
// Change namespaces
foreach (KeyValuePair<string, string> existingNamespace in existingNamespaces)
{
foreach (namespaceIndex name in namespaces)
{
if (name.Transitional == existingNamespace.Value)
{
existingNamespace.Value = name.Strict;
}
}
}
}
}
}
The below part is pseudo code. I don't know any way to search for all namespaces in a spreadsheet and register them to a dictionary. Using LINQ could be the way to move forward. Do you have any suggestion for me how to code a dictionary with all namespaces in a spreadsheet?
// Create dictionary of existing namespaces
Dictionary<string, string> existingNamespaces = spreadsheet
.GetAllParts()
.NamespaceDeclarations
.ToDictionary();
Secondly, the below is also pseudo code. I need a simple way to get and set any namespace registered in the dictionary of existing namespaces in the spreadsheet. I know of .AddNamespaceDeclations() and .RemoveNamespaceDeclarations(), but these must be hardcoded for each namespace, you want to change. I need a more generic way to do this in a foreach loop.
if (name.Transitional == existingNamespace.Value)
{
existingNamespace.Value = name.Strict;
}
Are these two pseudo code snippets, something that is possible to do with Open XML SDK and can you point me in the right direction?