Several integer key values do not state explicit valid ranges

Open petervwyatt opened this issue 5 years ago • 9 comments

Several integer keys in dictionaries do not state any explicit valid ranges, such as "positive integer ..." or "non-negative integer ...".

One quick fix may be to state once, up front in ISO 32K somewhere (where?), that key values representing counts, sizes, widths, heights, file byte offsets, object numbers, page numbers and <anything else that is common?> are non-negative unless stated otherwise.

Or we could review each and add the explicit wording in place.
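
The two options above can be sketched as a lookup in which explicit per-key wording overrides a blanket rule. This is only an illustration: the `IntRange` class, the category names and the `valid_range` helper are hypothetical, and the bounds come from the guesses in this issue, not from normative text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntRange:
    """Inclusive integer bounds; None means no limit on that side."""
    minimum: Optional[int] = None
    maximum: Optional[int] = None

    def contains(self, value: int) -> bool:
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Option 1: a single blanket rule per kind of key (category names are hypothetical).
BLANKET_RULES = {
    "count": IntRange(minimum=0),
    "size": IntRange(minimum=0),
    "byte_offset": IntRange(minimum=0),
    "object_number": IntRange(minimum=0),
    "page_number": IntRange(minimum=0),
}

# Option 2: explicit wording in place, keyed by (table, key). The entry below
# mirrors a guess from this issue ("Table 15: Size: unstated positive"),
# not normative text.
EXPLICIT_RULES = {
    ("Table 15", "Size"): IntRange(minimum=1),
}


def valid_range(table: str, key: str, kind: str) -> IntRange:
    """Explicit per-key wording wins; otherwise fall back to the blanket rule."""
    unrestricted = IntRange()
    return EXPLICIT_RULES.get((table, key), BLANKET_RULES.get(kind, unrestricted))
```

With these assumed entries, `valid_range("Table 15", "Size", "size")` rejects 0, while `valid_range("Table 45", "Size", "size")` falls back to the blanket rule and accepts it.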

Here is an incomplete list (from a search of ISO 32000-2 for "integer" up to about Table 100 - more to be added later):

  • Table 11 Columns: unstated non-negative
  • Table 11 Rows: unstated non-negative
  • Table 11 DamagedRowsBeforeError: unstated non-negative
  • Table 15: Size: unstated positive
  • Table 15: Prev: unstated non-negative or positive??
  • Table 16: N: unstated non-negative / positive??
  • Table 16: First: unstated non-negative / positive??
  • Table 17: Size: unstated > 1 (since has to be +1 on highest object# and PDFs need at least 1 object (technically more!))
  • Table 17: Prev: unstated non-negative or positive??
  • Table 19: XRefStm: unstated non-negative (just "byte offset")
  • Table 30: Count: unstated non-negative or positive?? (just "count of leaf nodes" - does it include Template pages?)
  • Table 31: StructParents: ?? is there a valid range ??
  • Table 45: Size: unstated non-negative
  • Table 87 (Image XObject): Width: unstated non-negative
  • Table 87 (Image XObject): Height: unstated non-negative
  • Table 87 (Image XObject): StructParent ?? is there a valid range ??
  • Table 93 (Form XObject): StructParent ?? is there a valid range ??
  • Table 93 (Form XObject): StructParents ?? is there a valid range ??
  • Table 95 (Reference dict): Page: unstated non-negative

petervwyatt, Jan 05 '21 22:01

More:

  • Table 109, 110: FirstChar, LastChar : unstated non-negative ??
  • Table 114: Supplement - ?? nothing ??
  • Table 125: Length, Length2, Length3 : unstated non-negative
  • Table 129, 130: Width, Height : unstated positive
  • Table 131: Width, Height, Width2, Height2 : unstated positive
  • Table 147: NumCopies: unstated positive
  • Table 166: StructParent: ??

petervwyatt, Jan 06 '21 05:01

More:

  • Table 232: MaxLen - is a 0-length text field useful or invalid???
  • Table 255: R, V - no info is given as to valid ranges for these versions
  • Table 255: Prop_AuthTime - assume 0 is valid so should be described as a "non-negative integer"?
  • Table 305: C, B - looks like 0 would be invalid so should be described as "positive integers"?

petervwyatt, Mar 23 '21 02:03

Also Table 8: Columns (positive)?

It doesn't make any sense to allow a value of 0 here, does it? FWIW, I'll come out and admit that I managed to produce a segfault when 0 was given.

Similarly for the Columns entry in Table 11 (you wrote non-negative above)?

pesco, Jul 11 '22 18:07

Re Columns in Tables 8 and 11 possibly being zero: I was half thinking about the highly degenerate corner case situation where there is no data in the compressed images, in which case Columns being zero might be equally pointless/degenerate. But if we assume images always have at least 1 pixel then yes, Columns would be strictly greater than zero.

petervwyatt, Jul 13 '22 04:07

How many rows of width zero does a zero-pixel image have? Allowing Columns = 0 begs the age-old question of what is 0 divided by 0.

However, a zero-pixel image with Columns any positive integer is perfectly possible. It simply has 0 rows of that size.

(I'm assuming this data is always stored in row-major order... is it?)
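
The argument above can be made concrete with a small sketch. It assumes decoded, row-major sample data with no per-row tag byte, and the function names and default parameter values are my own choices for illustration. With a Columns value greater than zero, zero bytes of data simply means zero rows; with Columns = 0 the row size is zero and the row count is undefined, which is the kind of input that can crash an unchecked decoder.

```python
def row_bytes(columns: int, colors: int = 1, bits_per_component: int = 8) -> int:
    """Bytes in one row of samples, rounded up to a whole byte."""
    return (columns * colors * bits_per_component + 7) // 8


def row_count(data_length: int, columns: int,
              colors: int = 1, bits_per_component: int = 8) -> int:
    """Number of complete rows in data_length bytes of row-major samples."""
    if columns <= 0:
        # A zero-width row would make the division below 0 / 0.
        raise ValueError("Columns must be greater than zero")
    stride = row_bytes(columns, colors, bits_per_component)
    rows, remainder = divmod(data_length, stride)
    if remainder:
        raise ValueError("data length is not a whole number of rows")
    return rows


print(row_count(0, columns=4))   # 0
print(row_count(12, columns=4))  # 3
```

Rejecting `columns <= 0` before dividing is the design choice here: it turns the degenerate case into an explicit error instead of leaving it to the arithmetic.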

pesco, Jul 13 '22 19:07

Yes - I believe these compressors require row-major order. As I said I only half thought about it 😀... so maybe Columns should be defined to always be strictly greater than zero.

petervwyatt, Jul 14 '22 00:07

BTW, based on a recent PDF TWG discussion, we will avoid terms such as "positive", "negative" or "non-negative" because of regional differences in whether each of these includes or excludes zero. We'll go with explicit wording w.r.t. zero.
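
As a rough illustration of what explicit wording could look like, the sketch below generates a description that never relies on "positive" or "non-negative". The helper name and the exact phrases are hypothetical and are not the wording chosen by the TWG.

```python
def explicit_wording(lower_bound: int, includes_bound: bool) -> str:
    """Describe an integer range without "positive" or "non-negative"."""
    relation = "greater than or equal to" if includes_bound else "greater than"
    return f"an integer {relation} {lower_bound}"


print(explicit_wording(0, includes_bound=False))  # an integer greater than 0
print(explicit_wording(0, includes_bound=True))   # an integer greater than or equal to 0
```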

petervwyatt, Jul 14 '22 00:07

More:

  • Table 307, Rate - a value of zero makes no sense as play speed... probably easiest to spec that 0 means default (1.0)
  • Table 316, F - has no specified limits???
  • Table 318, O - is opacity so shall be between 0.0 and 1.0 inclusive
  • Table 325, various keys which are numbers - are they all strictly > 0???
  • Tables 327, 328, 329, 330 - P is number of precision digits so "shall be" >= 0?
  • Tables 327, 328, 329, 330, 331 - the text height keys (TS) - shall be > 0 so that text is always visible albeit incredibly small? Or allow negative - is that mirrored vertically / upside down???
  • Table 330, EL - extension line length so has to be >= 0 ??

petervwyatt, Oct 13 '22 03:10

More:

  • Table 337, PlayCount - does 0 make any sense?

petervwyatt, Oct 13 '22 06:10