docx4j-search-and-replace-util

File corrupted after text replace

Open metallica33 opened this issue 2 years ago • 7 comments

I am using this utility to replace text in a docx file. When I save the docx file, it does not open in MS Word; Word shows the error in the attached screenshot. Converting to PDF with Docx4J.toPDF, however, creates the PDF file correctly.

Here is the code:

// try-with-resources closes all three streams, even if an exception is thrown
try (InputStream templateInputStream = new FileInputStream("C:/Documents/original.docx");
     FileOutputStream pdfOs = new FileOutputStream("C:/Documents/document.pdf");
     FileOutputStream docxOs = new FileOutputStream("C:/Documents/document.docx")) {
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(templateInputStream);
    String regex = ".*(calibri|cour|arial|times|comic|georgia|impact|LSANS|pala|tahoma|trebuc|verdana|symbol|webdings|wingding).*";
    PhysicalFonts.setRegex(regex);
    Map<String, String> replaceMap = new HashMap<>();
    replaceMap.put("<<user_name>>", "Jon Doe");
    replaceMap.put("<<user_email>>", "[email protected]");
    Docx4JSRUtil.searchAndReplace(wordMLPackage, replaceMap); // If I comment this out, the docx file is saved correctly
    Docx4J.save(wordMLPackage, docxOs); // The saved docx file does not open in MS Word
    Docx4J.toPDF(wordMLPackage, pdfOs); // The PDF file is created correctly
} catch (Docx4JException | IOException e) {
    e.printStackTrace();
}
[Attached image: Screenshot 2023-03-24 224328]

metallica33, Mar 24 '23 17:03

Hi @metallica33, I have never experienced this.

  • Are you using an old or a modern version of MS Word?
  • Are you using the latest version of this library?

phip1611, Mar 24 '23 17:03

  • docx4j-search-and-replace-util: v1.0.7
  • docx4j: v8
  • MS Word 365

metallica33, Mar 25 '23 03:03

Could you share the docx file with me and give me instructions for a minimal reproducer, please?

phip1611, Mar 25 '23 21:03

I have attached the files here. document.docx is the file that was saved after conversion. Just run the code provided earlier and it will create the document.pdf and document.docx files.

document.docx document.pdf original.docx

metallica33, Mar 26 '23 03:03

I have not had time to look into this so far, sorry. I hope to get to it sometime in the next few days.

phip1611, Apr 17 '23 11:04

I was facing the same issue while using docx4j v6.1.2. After a couple of Google searches, I ended up using these libraries:

<dependency>
	<groupId>org.docx4j</groupId>
	<artifactId>docx4j-core</artifactId>
	<version>8.3.9</version>
</dependency>
<dependency>
	<groupId>org.docx4j</groupId>
	<artifactId>docx4j-export-fo</artifactId>
	<version>8.3.9</version>
</dependency>
<dependency>
	<groupId>org.docx4j</groupId>
	<artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
	<version>8.3.9</version>
</dependency>
<dependency>
	<groupId>jakarta.xml.bind</groupId>
	<artifactId>jakarta.xml.bind-api</artifactId>
	<version>4.0.0</version>
</dependency>
<dependency>
	<groupId>org.glassfish.jaxb</groupId>
	<artifactId>jaxb-runtime</artifactId>
	<version>4.0.3</version>
</dependency>
<dependency>
	<groupId>de.phip1611</groupId>
	<artifactId>docx4j-search-and-replace-util</artifactId>
	<version>1.0.7</version>
</dependency>

ThiagoDosSantos, Jul 20 '23 17:07

Sorry, I don't have the capacity to investigate this.

phip1611, Jan 19 '24 15:01
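
For anyone trying to narrow this down: the thread never establishes why Word rejects the saved file, so the following is only a diagnostic sketch, not a fix. It uses nothing but the JDK (no docx4j calls) to check that a saved .docx is a readable ZIP archive and that its word/document.xml part is well-formed XML. The class name DocxSanityCheck and the method check are made up for this example. Note that a file can pass this check and still be rejected by Word, for instance if the XML is well-formed but does not match the WordprocessingML schema.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of docx4j or docx4j-search-and-replace-util.
class DocxSanityCheck {

    /**
     * Returns null if word/document.xml exists and parses as XML,
     * otherwise a short description of the problem.
     */
    static String check(InputStream docx) throws IOException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals("word/document.xml")) {
                    continue;
                }
                // Copy the entry into memory first so the XML parser
                // cannot close the underlying ZIP stream.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                try {
                    factory.newDocumentBuilder().parse(new ByteArrayInputStream(buffer.toByteArray()));
                    return null;
                } catch (SAXException e) {
                    return "word/document.xml is not well-formed: " + e.getMessage();
                }
            }
        }
        return "word/document.xml not found (is this a valid docx/ZIP file?)";
    }

    public static void main(String[] args) throws Exception {
        // Usage: java DocxSanityCheck C:/Documents/document.docx
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String problem = check(in);
            System.out.println(problem == null ? "word/document.xml parses as XML" : problem);
        }
    }
}
```

Running this against both original.docx and document.docx would at least show whether the replacement step produced XML that no longer parses, or whether the problem lies somewhere a plain XML parser cannot see.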