ChangeDocumentType does not fully remove vbaProject reference

Open mc2002tii opened this issue 5 years ago • 6 comments

Description

I'm using the sample code at https://docs.microsoft.com/en-us/office/open-xml/how-to-convert-a-word-processing-document-from-the-docm-to-the-docx-file-format to remove macros from a .docm file and convert it to .docx, because our filtering software blocks the transfer of files that contain macros.

Using that sample code I delete the VbaProjectPart, change the document type, and change the file extension. However, our filtering software identifies the resulting file as corrupt (Word 2016 opens the file just fine though, so it is probably within spec).

When I examine the contents of the .docx file, I notice that the [Content_Types].xml file at the root still contains the following line: <Default ContentType="application/vnd.ms-office.vbaProject" Extension="bin"/>

The VbaProjectPart PartName reference is gone and no other content in the .docx file structure contains any macro components. I think that one line in [Content_Types].xml is enough to trip up our scanner.
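A quick way to confirm this without unzipping the file by hand is to list the Default mappings in [Content_Types].xml. The following is a minimal sketch that uses only System.IO.Compression and System.Xml.Linq rather than the Open XML SDK; `ContentTypesInspector` and `PrintDefaults` are illustrative names, not part of any library.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

static class ContentTypesInspector
{
    // XML namespace used by [Content_Types].xml in Open Packaging Conventions packages.
    static readonly XNamespace Ct =
        "http://schemas.openxmlformats.org/package/2006/content-types";

    // Prints every <Default Extension="..." ContentType="..."/> mapping in the package.
    public static void PrintDefaults(string path)
    {
        using (ZipArchive archive = ZipFile.OpenRead(path))
        {
            ZipArchiveEntry entry = archive.GetEntry("[Content_Types].xml");
            if (entry == null)
            {
                Console.WriteLine("No [Content_Types].xml found.");
                return;
            }

            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);
                foreach (XElement d in xml.Root.Elements(Ct + "Default"))
                {
                    Console.WriteLine(
                        $"{(string)d.Attribute("Extension")} -> {(string)d.Attribute("ContentType")}");
                }
            }
        }
    }
}
```

Calling `ContentTypesInspector.PrintDefaults(destinationPath)` on the converted file should show whether the `bin` extension is still mapped to `application/vnd.ms-office.vbaProject`.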

Is there some other way to get rid of this line that I'm missing, is this a bug, or is this structure just something that our scanning software should accept?

Information

  • .NET Target: .NET Core 2.1.12
  • DocumentFormat.OpenXml Version: 2.10.0-beta0002

Repro

        bool fileChanged = false;

        using (WordprocessingDocument document = WordprocessingDocument.Open(sourcePath, true))
        {
            // Access the main document part.
            var docPart = document.MainDocumentPart;

            // Look for the vbaProject part. If it is there, delete it.
            var vbaPart = docPart.VbaProjectPart;
            if (vbaPart != null)
            {
                // Delete the vbaProject part and then save the document.
                docPart.DeletePart(vbaPart);
                docPart.Document.Save();

                // Track that the document has been changed.
                fileChanged = true;
            }

            // Change the document type to not macro-enabled
            document.ChangeDocumentType(WordprocessingDocumentType.Document);
        }

        if (fileChanged)
        {
            // If it already exists, it will be deleted!
            if (File.Exists(destinationPath))
            {
                File.Delete(destinationPath);
            }

            // Rename the file and save changes
            Directory.CreateDirectory(destinationDirectory);
            File.Move(sourcePath, destinationPath);
        }

Observed

The [Content_Types].xml in file.docx still contains a macro reference.

Expected

file.docx should not contain any references to macros.
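Until the SDK produces that result, one possible stopgap is to post-process the converted file as a plain ZIP archive and drop the stale Default entry. This is an untested sketch, not an SDK feature: `ContentTypesCleanup` and `RemoveVbaDefault` are hypothetical names, and whether a given scanner accepts the result is an assumption. It deliberately does nothing if any `.bin` entry remains in the package, since such a part might still rely on the Default mapping.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

static class ContentTypesCleanup
{
    static readonly XNamespace Ct =
        "http://schemas.openxmlformats.org/package/2006/content-types";
    const string VbaContentType = "application/vnd.ms-office.vbaProject";

    // Returns true if a stale vbaProject Default entry was removed.
    public static bool RemoveVbaDefault(string path)
    {
        using (ZipArchive archive = ZipFile.Open(path, ZipArchiveMode.Update))
        {
            // Be conservative: leave the file alone if any .bin part is still present.
            if (archive.Entries.Any(e =>
                    e.FullName.EndsWith(".bin", StringComparison.OrdinalIgnoreCase)))
            {
                return false;
            }

            ZipArchiveEntry entry = archive.GetEntry("[Content_Types].xml");
            if (entry == null)
            {
                return false;
            }

            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);

                var stale = xml.Root.Elements(Ct + "Default")
                    .Where(d => (string)d.Attribute("ContentType") == VbaContentType)
                    .ToList();
                if (stale.Count == 0)
                {
                    return false;
                }

                foreach (XElement d in stale)
                {
                    d.Remove();
                }

                // Overwrite the entry in place with the trimmed XML.
                stream.SetLength(0);
                stream.Position = 0;
                xml.Save(stream);
                return true;
            }
        }
    }
}
```

It would be called on the destination file only after the `WordprocessingDocument` has been disposed and the file moved, for example `ContentTypesCleanup.RemoveVbaDefault(destinationPath)`. On .NET Framework, `ZipFile` additionally requires a reference to System.IO.Compression.FileSystem.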

mc2002tii avatar Aug 15 '19 16:08 mc2002tii

Same issue here. Are there any solutions?

waizui avatar May 08 '20 10:05 waizui

@mc2002tii Can you include something I can repro?

twsouthwick avatar May 30 '20 00:05 twsouthwick

@mc2002tii Can you include something I can repro?

I'll have to test this when I'm back in the office next week. I couldn't reproduce it today, but at home I have a completely different environment (O365 vs Word 2016, Mac vs. Windows). I know I could still reproduce it with .NET Core 3.1 and DocumentFormat.OpenXML 2.10, but I don't think I tried again when 2.11 came out.

mc2002tii avatar Jun 01 '20 13:06 mc2002tii

I am facing this issue with .NET 6 and DocumentFormat.OpenXml 2.16.

prudhvi2050 avatar Aug 31 '22 13:08 prudhvi2050

Related to issue #1551.

AlfredHellstern avatar Jan 23 '24 19:01 AlfredHellstern

#1551

tomjebo avatar Jan 23 '24 19:01 tomjebo