robotframework-doctestlibrary
Question: is it feasible to compare two MS Office documents?
As the title suggests, I wonder whether it is feasible to compare the contents of two MS Office documents, such as Word or PowerPoint files, find the location of each difference, and then output a comparison picture that highlights the differences found.
Any response would be appreciated. Thanks in advance.
Yes, this should be possible. But as the library is focused on .PDF or Image comparisons, it would mean we need to convert those .pptx or .docx files to PDF (or PNG) first. (I would recommend .PDF).
To try it yourself:
- Export the .pptx file in PowerPoint as .pdf (FILE > Export > Create PDF)
- Add a small change and export again as .pdf
- Compare both .pdf files using the library
Thank you very much, @manykarim. There should be ways to automatically convert Office documents into PDF files, which would then let us use the library as it is. Still, it might be helpful, and would enrich the library's functionality, if it could compare two Office documents directly.
But there may be a difficulty with drawing a rectangle onto a Word document, since that would change the layout of the content, so PDF is probably the better option in that case.
I could think about it. However, there are already libraries out there that do the conversion, e.g. from Word to PDF, such as https://rpaframework.org/libraries/word_application/ Maybe it's worth checking those out first. I want to avoid parallel/duplicate development there.
Also, this approach using pure Python looks simple: https://stackoverflow.com/questions/6011115/doc-to-pdf-using-python
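For illustration, here is a minimal sketch of an automated conversion step. It assumes LibreOffice is installed and its `soffice` binary is on the PATH; that choice is an assumption made for this example, not something this library provides. The helper names `build_convert_command` and `convert_to_pdf` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, soffice="soffice"):
    """Build the LibreOffice command line that converts one document to PDF."""
    return [
        soffice,
        "--headless",            # run without opening a GUI window
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source, out_dir, soffice="soffice"):
    """Convert a .docx/.pptx file to PDF and return the expected output path.

    Assumption: LibreOffice writes <source stem>.pdf into out_dir.
    """
    source = Path(source)
    out_dir = Path(out_dir)
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"'{soffice}' was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(source, out_dir, soffice), check=True)
    return out_dir / (source.stem + ".pdf")
```

Running `convert_to_pdf` on both the reference and the changed document gives two .pdf files, which can then be compared with the library as described in the steps above.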