
Parse xml issue

Open caotrongtin99 opened this issue 3 years ago • 4 comments

I have two docx files: one generated by docx.js and one created by MS Word. This is the code I use to get the number of pages in a docx file, but it does not work for the file generated by docx.js.

const fs = require("fs");
const parseString = require("xml2js").parseString;
const toString = require("stream-to-string");
const unzipper = require("unzipper");

const getDocxPageCount = filePath => {
  return new Promise((resolve, reject) => {
    // Return after rejecting so we do not try to stream a missing file
    if (!fs.existsSync(filePath)) return reject("Error in reading file");
    fs.createReadStream(filePath)
      .pipe(unzipper.Parse())
      .on("entry", entry => {
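        if (entry.path === "docProps/app.xml") {
          toString(entry).then(xml => {
            parseString(xml, function (err, result) {
              // Surface XML parse failures instead of reading from an undefined result
              if (err) return reject(err);
              const props = (result && result["Properties"]) || {};
              console.log({ result: props });
              // Pages may be absent, so guard before indexing into it
              if (props["Pages"] && props["Pages"][0]) {
                resolve(props["Pages"][0]);
              } else {
                reject("Cannot find page count");
              }
            });
          }).catch(reject);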
        } else {
          entry.autodrain();
        }
      });
  });
};
  • This is the result when I parse the XML file generated by docx.js:
[Screenshot: Screen Shot 2021-12-03 at 11 50 16 AM]
  • This is the result when I parse the XML file created by MS Word:
[Screenshot: Screen Shot 2021-12-03 at 11 51 42 AM]

Please take a look at them. Thanks a lot!
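For anyone hitting the same problem, here is a minimal sketch of a defensive lookup. It assumes the object shape that xml2js produces with its default options (each child element becomes an array of strings); `extractPageCount` is a hypothetical helper name, not part of docx.js or xml2js. It returns null instead of throwing when `Pages` is missing.

```javascript
// Hypothetical helper: pull the page count out of an xml2js-parsed app.xml.
// Returns a number, or null when <Pages> is missing or not numeric.
const extractPageCount = parsed => {
  const pages = parsed && parsed.Properties && parsed.Properties.Pages;
  if (!Array.isArray(pages) || pages.length === 0) return null;
  const count = Number.parseInt(pages[0], 10);
  return Number.isNaN(count) ? null : count;
};

// Example inputs shaped like xml2js output (values here are made up)
console.log(extractPageCount({ Properties: { Pages: ["3"] } })); // 3
console.log(extractPageCount({ Properties: { Application: ["docx"] } })); // null
```

A null result lets the caller decide whether a missing count is an error or just means the file has never been saved by an editor.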

caotrongtin99 • Dec 03 '21 04:12

Calculating the number of pages requires rendering the document, which is complex, would take a lot of code, and is not currently supported. To get a page count, you first need to open the file in a docx editor such as Word or LibreOffice and resave it.
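The resave step could in principle be scripted. The following is only a sketch under several assumptions: LibreOffice is installed and `soffice` is on the PATH, and a headless conversion refreshes the `Pages` value in docProps/app.xml (worth verifying on your own files). `buildResaveArgs`, `resavedPath`, and `resaveWithLibreOffice` are hypothetical helper names. Any count obtained this way would reflect LibreOffice's layout, which can differ from Word's.

```javascript
const { execFile } = require("child_process");
const path = require("path");

// Build the argument list for a headless LibreOffice re-save.
// --headless, --convert-to and --outdir are standard soffice flags.
const buildResaveArgs = (inputPath, outDir) => [
  "--headless",
  "--convert-to",
  "docx",
  "--outdir",
  outDir,
  inputPath,
];

// LibreOffice writes <input base name>.docx into outDir.
const resavedPath = (inputPath, outDir) =>
  path.join(outDir, path.basename(inputPath, path.extname(inputPath)) + ".docx");

// Hypothetical helper: re-save inputPath into outDir and resolve with the new path.
// Assumes `soffice` is on PATH; use an outDir different from the input's
// directory so the original file is not overwritten.
const resaveWithLibreOffice = (inputPath, outDir) =>
  new Promise((resolve, reject) => {
    execFile("soffice", buildResaveArgs(inputPath, outDir), err => {
      if (err) return reject(err);
      resolve(resavedPath(inputPath, outDir));
    });
  });
```

The resolved path could then be passed to a page-count reader such as `getDocxPageCount` from the original post.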

devoidfury • Dec 03 '21 20:12

Is there an alternative way to count the pages of a docx file generated by docx.js? I want the same page count as in MS Word (the page counts in MS Word and LibreOffice differ). @devoidfury

caotrongtin99 • Dec 06 '21 02:12

No, the only way would be to open it with Word and resave it.

devoidfury • Dec 06 '21 03:12

Thank you, @devoidfury!

caotrongtin99 • Dec 06 '21 03:12