Microsoft OpenType 'cmap' deprecations - refresh PDF wording?

Open petervwyatt opened this issue 4 years ago • 5 comments

Referring to the Microsoft OpenType 'cmap' documentation at https://docs.microsoft.com/en-us/typography/opentype/spec/cmap:

Clause "9.6.5.4 Encodings for TrueType fonts" in ISO 32000-2:2020 makes explicit reference to various platform-specific encoding ID values that are now explicitly marked as deprecated or discouraged in the Microsoft OpenType documentation (URL above: "Use of encoding IDs 0, 1 or 2 is deprecated.", "For current Apple platforms, use of platform ID 1 is discouraged").

Although PDF processors are expected to encounter legacy PDFs with legacy embedded fonts, the current wording in ISO 32000-2 reads as if these deprecated values are the only values expected. Maybe it is worth adding a note about the TrueType deprecations?
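
To make the values under discussion concrete, here is a minimal Python sketch that reads the encoding records (platform ID, encoding ID, subtable offset) from a 'cmap' table header and labels each one. The header layout follows the 'cmap' documentation linked above. The `classify` helper, its labels, and the sample bytes are hypothetical, and reading the "encoding IDs 0, 1 or 2" sentence as applying to platform ID 0 (Unicode) is an assumption of this sketch, not something the PDF specification states.

```python
import struct

# Assumption: the quoted deprecation sentence is read here as applying to
# the Unicode platform (platform ID 0). This is an interpretation made for
# the sketch, not wording taken from ISO 32000-2.
DEPRECATED_UNICODE_ENCODING_IDS = {0, 1, 2}
MACINTOSH_PLATFORM_ID = 1


def read_encoding_records(cmap: bytes):
    """Return (platformID, encodingID, subtableOffset) tuples from a 'cmap' table."""
    # Header: uint16 version, uint16 numTables (big-endian).
    _version, num_tables = struct.unpack_from(">HH", cmap, 0)
    records = []
    for i in range(num_tables):
        # Each encoding record: uint16 platformID, uint16 encodingID, Offset32.
        records.append(struct.unpack_from(">HHI", cmap, 4 + 8 * i))
    return records


def classify(platform_id: int, encoding_id: int) -> str:
    """Hypothetical helper: label one record using the quoted Microsoft wording."""
    if platform_id == 0 and encoding_id in DEPRECATED_UNICODE_ENCODING_IDS:
        return "deprecated"
    if platform_id == MACINTOSH_PLATFORM_ID:
        return "discouraged"
    return "current"


# Hypothetical header with three encoding records; the offsets are placeholders
# and no subtable data follows.
sample_records = [(3, 1, 28), (1, 0, 28), (0, 1, 28)]
sample = struct.pack(">HH", 0, len(sample_records)) + b"".join(
    struct.pack(">HHI", p, e, off) for p, e, off in sample_records
)

report = [(p, e, classify(p, e)) for p, e, _ in read_encoding_records(sample)]
for platform_id, encoding_id, label in report:
    print(f"({platform_id},{encoding_id}): {label}")
```

A PDF processor would still have to accept every one of these records in legacy files; the labels only describe what the Microsoft documentation says about creating new fonts.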

petervwyatt, Mar 13 '21 04:03

According to 9.6.3 both TrueType and OpenType formats are supported:

...both the TrueType font format (see Apple Computer, Inc., TrueType Reference Manual) as well as the OpenType font format (as defined by ISO/IEC 14496-22).

The Microsoft OpenType reference quoted above matches ISO 14496 more closely than the Apple TrueType Reference. The clause "Use of encoding IDs 0, 1 or 2 is deprecated." in the Microsoft spec is not present in the Apple spec. Therefore it would be dangerous or even wrong to simply mirror Microsoft's deprecation note as it doesn't apply to TrueType fonts according to Apple's understanding.

t-merz, Dec 07 '21 14:12

@t-merz - thanks for the analysis (and sorry for the long delay).

PDF 2.0 defines OpenType via a normative reference to ISO 14496-22. Two Amendments to that standard have been published since PDF 2.0 was published (they are NOT incorporated by PDF 2.0 because we use a dated reference!), and a full dated revision is in progress according to https://www.iso.org/standard/74461.html.

I suggest we:

  1. get both Amendments to see if these have any potential impact on PDF 2.0
  2. ask ISO/IEC JTC 1/SC 29 about the Microsoft vs Apple difference, and for clarification.

Both of these would best be done via ISO TC 171 SC 2 WG 8.

petervwyatt, Oct 16 '23 02:10

The PDF TWG agrees that the boilerplate text for dated revisions in clause 2 should state "incl. amendments".

petervwyatt, Oct 16 '23 18:10

A discussion has been opened up with ISO JTC 1 SC 29 experts to discuss the forthcoming dated revision of ISO 14496-22 and how this might impact PDF 2.0. The initial recommendation is for PDF to switch away from ISO and to use the Microsoft OpenType spec.

Re-opening this erratum to track the progress of these discussions with ISO JTC 1 SC 29 experts in light of the "cmap" table deprecations and potentially new conflicting tables (e.g. "COLR").

petervwyatt, Oct 25 '23 00:10

Feedback from ISO JTC 1 SC 29 experts:

  • The PDF spec should probably reference ISO/IEC 14496-22 in its generic (undated) form (and also refer generically to amendments).

  • The question of what “deprecation” means is not sufficiently addressed, but from a practical standpoint in OFF & OpenType, deprecation basically means “it’s documented here for posterity but new creations should not use it”. It’s not the entire ‘cmap’ table that is deprecated; rather, some numeric identifier values for some fields within the cmap. Additionally, “deprecation” does not imply impending removal. What’s unclear is whether it means eventual removal and what the implications are for things that refer to it.

ISO TC 171 SC 2 also hopes that ISO JTC 1 SC 29 can clarify their meaning of "deprecation" in a future ISO 14496-22.

This issue will be parked until the dated revision of ISO 14496-22 is published, at which point ISO 32000-2 will need to be updated.

petervwyatt, Oct 31 '23 22:10