web-monitoring-ui

Render word docs using Google or Microsoft embedded renderers

Open Mr0grog opened this issue 7 years ago • 2 comments

This is kind of related to #179: like PDF and other non-text file formats, we can’t diff MS Word documents. BUT! Both Google and Microsoft offer iframe-embeddable renderers for Word docs, so we could use those to display the contents of the file, even if we can’t diff it.

Google: https://docs.google.com/gview?url=https://edgi-versionista-archive.s3.amazonaws.com/versionista2/74286-6216580/version-14182260.doc&embedded=true

https://docs.google.com/gview?url={URL here}&embedded=true

Microsoft: https://view.officeapps.live.com/op/embed.aspx?src=https://edgi-versionista-archive.s3.amazonaws.com/versionista2/74286-6216580/version-14182260.doc

https://view.officeapps.live.com/op/embed.aspx?src={URL here}
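
To make the two URL templates above concrete, here is a minimal sketch of a helper that fills in the `{URL here}` slot. The function name `embeddedViewerUrl` and the `ViewerService` type are hypothetical and not part of web-monitoring-ui. The sketch also percent-encodes the document URL, which the example links above do not do; that is an assumption made so that query strings inside the document URL do not get mixed up with the viewer's own parameters.

```typescript
// Hypothetical helper: not part of the existing codebase.
type ViewerService = 'google' | 'microsoft';

/**
 * Build the URL for an iframe-embeddable document viewer.
 * `documentUrl` must be publicly reachable, since the viewer service
 * fetches the file itself.
 */
function embeddedViewerUrl(
  documentUrl: string,
  service: ViewerService = 'google'
): string {
  // Assumption: encode the URL so characters like `&` or `?` in it
  // are not read as parameters of the viewer URL.
  const encoded = encodeURIComponent(documentUrl);

  if (service === 'microsoft') {
    return `https://view.officeapps.live.com/op/embed.aspx?src=${encoded}`;
  }
  return `https://docs.google.com/gview?url=${encoded}&embedded=true`;
}
```

The result would be used as the `src` of an iframe, for example `embeddedViewerUrl(versionUrl, 'microsoft')`, where `versionUrl` stands in for wherever the raw file URL comes from.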

We should see if these viewers work for PowerPoint and Excel files, too.

And of course we should also see if we can figure out a way to actually diff them, but this is an easy short-term solution that’s better than displaying nothing at all.

Mr0grog, Jan 19 '18 07:01

See also this Stack Overflow thread: https://stackoverflow.com/questions/27957766/how-do-i-render-a-word-document-doc-docx-in-the-browser-using-javascript

Mr0grog, Jan 19 '18 07:01

For this, you’ll probably want to create a new view that renders a Word document using one of the above methods. See SandboxedHtml for an example, although this view will hopefully be much simpler.
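
As a rough, framework-free illustration of how little such a view needs to do, the sketch below turns a viewer URL into iframe markup. It is not the project's component: the real view would presumably be written in the same style as SandboxedHtml, and the names `wordDocumentIframeHtml` and `escapeAttribute` are hypothetical.

```typescript
// Hypothetical sketch: the actual view would follow the project's
// own component conventions rather than building an HTML string.

/** Escape a string for safe use inside a double-quoted HTML attribute. */
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

/**
 * Produce iframe markup that shows a document through an embedded
 * viewer. `viewerUrl` is a complete Google or Microsoft viewer URL.
 */
function wordDocumentIframeHtml(
  viewerUrl: string,
  title: string = 'Word document preview'
): string {
  const src = escapeAttribute(viewerUrl);
  const label = escapeAttribute(title);
  return `<iframe src="${src}" title="${label}"></iframe>`;
}
```

All of the rendering is delegated to the third-party viewer, so the view only has to supply a `src` and an accessible `title`.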

Then modify RawVersion.render() and SideBySideRawVersions.renderVersion() to use that view based on the media type of the version you are rendering.

Check out ChangeView.mediaTypeForVersion() to see how to determine the media type for a version object. (In the future, we hope to have an actual media type field on version objects, but that’s not done yet; see edgi-govdata-archiving/web-monitoring-db#199.)
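
Once a media type string is in hand, the choice of view could look something like the sketch below. The two media types listed are the registered ones for `.doc` and `.docx` files; the function names and the `'word-viewer'` / `'default'` labels are hypothetical stand-ins for whatever RawVersion.render() and SideBySideRawVersions.renderVersion() actually return.

```typescript
// Registered media types for Word documents (.doc and .docx).
const WORD_MEDIA_TYPES: ReadonlySet<string> = new Set([
  'application/msword',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
]);

/** True if the media type (optionally with parameters) is a Word type. */
function isWordMediaType(mediaType: string): boolean {
  // Drop parameters such as "; charset=binary" and normalize case.
  const essence = mediaType.split(';')[0].trim().toLowerCase();
  return WORD_MEDIA_TYPES.has(essence);
}

// Hypothetical labels: a real implementation would return a view
// component instead of a string.
type RendererKind = 'word-viewer' | 'default';

/** Decide which kind of view to use for a version's media type. */
function pickRenderer(mediaType: string): RendererKind {
  return isWordMediaType(mediaType) ? 'word-viewer' : 'default';
}
```

If the viewers turn out to handle PowerPoint and Excel files as well, their media types could be added to the same set.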

Mr0grog, Mar 16 '18 23:03