Parse xml issue
I have two docx files: one generated by docx.js and another created by MS Word. This is my code to get the number of pages of a docx file, but it does not work for the file generated by docx.js.
const fs = require("fs");
const parseString = require("xml2js").parseString;
const toString = require("stream-to-string");
const unzipper = require("unzipper");

// Resolves with the page count stored in docProps/app.xml, if the file has one.
const getDocxPageCount = filePath => {
  return new Promise((resolve, reject) => {
    // Return here so a missing file does not fall through to createReadStream.
    if (!fs.existsSync(filePath)) return reject("Error in reading file");
    fs.createReadStream(filePath)
      .pipe(unzipper.Parse())
      .on("entry", entry => {
        if (entry.path === "docProps/app.xml") {
          toString(entry).then(xml => {
            parseString(xml, (err, result) => {
              if (err) return reject(err);
              console.log({ result: result?.Properties });
              // Guard every level: <Pages> may be missing from the metadata.
              const pages = result?.Properties?.Pages?.[0];
              if (pages) {
                resolve(pages);
              } else {
                reject("Cannot find page count");
              }
            });
          }).catch(reject);
        } else {
          entry.autodrain();
        }
      })
      .on("error", reject);
  });
};
- This is the result when I parse the XML file generated by docx.js:

- This is the result when I parse the XML file created by MS Word:

Please take a look at them. Thanks a lot!
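One way to compare the two files is to check whether `docProps/app.xml` contains a `Pages` element at all. The following is a minimal sketch, not the code from the question: `readCachedPageCount` is a hypothetical helper, both XML strings are made-up samples, and the regular expression assumes the element is written without a namespace prefix.

```javascript
// Return the page count cached in the text of docProps/app.xml,
// or null when the <Pages> element is absent.
// Assumption: the element is unprefixed, i.e. <Pages>, not <ep:Pages>.
function readCachedPageCount(appXml) {
  const match = /<Pages>\s*(\d+)\s*<\/Pages>/.exec(appXml);
  return match ? Number(match[1]) : null;
}

// Hypothetical app.xml contents, for illustration only.
const ns = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties";
const withPages = `<Properties xmlns="${ns}"><Pages>3</Pages><Words>120</Words></Properties>`;
const withoutPages = `<Properties xmlns="${ns}"><Words>120</Words></Properties>`;

console.log(readCachedPageCount(withPages));    // 3
console.log(readCachedPageCount(withoutPages)); // null
```

Returning `null` rather than throwing lets the caller distinguish "no cached count" from a parse failure.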
Calculating the number of pages requires rendering the document; that is a complex task that would take a lot of code and is currently not supported. To get the page count, you'll need to first open the file in a docx editor such as Word or LibreOffice and resave it.
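The resave step could in principle be scripted with LibreOffice's headless mode. This is a rough sketch under several assumptions: the `soffice` binary is on the PATH, `buildResaveArgs` and `resaveWithLibreOffice` are hypothetical helper names, and it should be verified on a real file that the resaved copy actually carries a page count in its metadata. LibreOffice's pagination can also differ from Word's.

```javascript
const { execFile } = require("node:child_process");
const path = require("node:path");

// Build the argument list for a headless LibreOffice conversion to docx.
function buildResaveArgs(inputPath, outDir) {
  return ["--headless", "--convert-to", "docx", "--outdir", outDir, inputPath];
}

// Path where LibreOffice is expected to write the resaved copy:
// the input's base name with a .docx extension, inside outDir.
function resavedPath(inputPath, outDir) {
  const base = path.basename(inputPath, path.extname(inputPath));
  return path.join(outDir, base + ".docx");
}

// Resave inputPath into outDir (assumes "soffice" is installed and on PATH).
// Use an outDir different from the input's folder to avoid overwriting it.
function resaveWithLibreOffice(inputPath, outDir) {
  return new Promise((resolve, reject) => {
    execFile("soffice", buildResaveArgs(inputPath, outDir), err => {
      if (err) return reject(err);
      resolve(resavedPath(inputPath, outDir));
    });
  });
}
```

The resolved path can then be passed to a function that reads `docProps/app.xml` from the resaved file.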
Is there an alternative way to count the pages of a docx file generated by docx.js? I want the same page count as in MS Word (the page counts in MS Word and LibreOffice differ). @devoidfury
No, the only way would be to open it with Word and resave it.
Thank you @devoidfury !