
Fix charset encoding of framed documents

Open Treora opened this issue 4 years ago • 0 comments

Like issue #29, but for subdocuments inside frames. As remarked here:

        get blob() { return new Blob([this.string], { type: 'text/html' }) },
        get string() {
            // TODO Add <meta charset> if absent? Or html-encode characters as needed?
            return documentOuterHTML(clonedDoc)
        },

The same applies to crawl-subresources for frames whose inner document we cannot access directly.

It seems new Blob() always UTF-8-encodes the strings it is given (mdn). I suppose we should either add <meta charset="utf-8"> to the DOM before running documentOuterHTML, or change the blob’s MIME type to text/html;charset=utf-8. The latter is something we could not do for the top-level document; might it be ‘cleaner’?
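To make the two options concrete, here is a rough sketch, not the project's actual code. The function names `htmlStringToBlob` and `ensureUtf8MetaCharset` are hypothetical, and the second one assumes it receives an HTML document (such as the cloned frame document) that we are free to modify.

```javascript
// Option A: declare the charset in the blob's MIME type.
// Strings passed to new Blob() are encoded as UTF-8, so the label matches the bytes.
function htmlStringToBlob(htmlString) {
    return new Blob([htmlString], { type: 'text/html;charset=utf-8' })
}

// Option B: make sure the document itself declares UTF-8 before serialising it.
// Assumption: `doc` is an HTML document we may modify (e.g. the clone).
// This does not touch any <meta http-equiv="Content-Type"> declaration.
function ensureUtf8MetaCharset(doc) {
    if (!doc.head) return // nothing sensible to do without a <head>
    let meta = doc.querySelector('meta[charset]')
    if (!meta) {
        meta = doc.createElement('meta')
        // Put it first, so it appears early in the serialised <head>.
        doc.head.prepend(meta)
    }
    meta.setAttribute('charset', 'utf-8')
}
```

Option A keeps the DOM untouched but only helps where the blob's type is actually read; option B also survives when the HTML is saved or served without that type.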

Problem observed in the wild.

Treora, Mar 11 '20 16:03