pdf-issues
32004 imprecisions
In the context of finding a wording recommendation to resolve #475, two possible imprecisions in ISO/TS 32004 became apparent:
- Unclear meaning of bit 13 in standard security handler user access permissions: The TS only changes ISO 32000-2 table 22 by adding a row for bit 13 to indicate a MAC token requirement, see ISO/TS 32004 section 5.1.2. It does not change the row for bit positions 13-32 which requires a value 1. Thus, there essentially are two mappings for bit 13 in that table after applying the ISO/TS 32004 changes, the new one (allowing 0 or 1 to indicate a MAC token requirement) and the old one (allowing only 1 and being reserved).
This can be resolved by additionally changing the row for bits 13-32 to only apply to bits 14-32.
(One option discussed to resolve #475 would be to add a similar row for bits 13-32 to ISO 32000-2 table 24; if that option is eventually chosen, ISO/TS 32004 section 5.1.3 will have to be similarly extended to restrict that new row to bits 14-32.)
- Multiple permission values for public key security handlers allow inconsistent MAC token requirement value: Public key security handlers allow different permissions for different recipients. Thus, different recipients may have different bit 13 values in their respective decrypted permission values and, therefore, different opinions on whether or not MACs are required.
Furthermore, in the case of SubFilter adbe.pkcs7.s5, recipient lists are specified in the crypt filter dictionaries, which allows different permission bit 13 values for the same recipient. And if StrF and StmF both are Identity, any custom crypt filters don't include permission values, leaving bit 13 undefined.
This could be somewhat mitigated by adding a note to 32004 stressing the need to make the bit 13 value match for all recipients and filters. And another note might be added for cases without any permission value.
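To illustrate the inconsistency described above, here is a minimal sketch that collects the MAC requirement each permission value would express. It assumes, as the wording further down in this issue suggests, that a cleared bit 13 expresses the MAC token requirement, and that bits are numbered from 1 at the low-order end. The function names are hypothetical and not taken from any specification or library.

```python
# Hypothetical helper names; a sketch under the assumptions stated above.
MAC_BIT = 1 << 12  # bit 13, counting bit positions from 1 at the low-order end


def mac_required(p: int) -> bool:
    """Assumed reading: a cleared bit 13 means MAC tokens are required."""
    # Python's & treats negative ints as two's complement, so signed
    # 32-bit permission values such as -1 work without masking.
    return (p & MAC_BIT) == 0


def mac_requirement_views(permission_values) -> set:
    """Collect the distinct MAC requirements expressed by all permission values."""
    return {mac_required(p) for p in permission_values}


# One recipient sees bit 13 set, another sees it cleared.
views = mac_requirement_views([-1, -1 & ~MAC_BIT])
```

A result with one element means all recipients and filters agree, two elements means they disagree, and an empty set corresponds to the case where no permission value exists at all.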
Furthermore, I'm unclear about one item: The public-key security handler user access permissions include a bit 2 defined as
When set permits change of encryption and enables all other permissions.
Does this also refer to bit 13? In other words, does a set bit 2 in the permissions suppress a requirement for MAC tokens expressed by a cleared bit 13? This should then also be covered by a note in 32004.
Another comment worth resolving (occurred during my review of the dated revision draft in Metanorma):
Table 6, MAC byte-string entry states "The first byte of the allocated region shall be a LESS-THAN SIGN (3Ch), the final byte shall be a GREATER-THAN SIGN (3Eh), and the sequence of bytes in between shall consist of hexadecimal digits representing the DER-encoded PDF MAC token, free of any trailing data." - does this permit PDF whitespace in between the < and > since whitespace is valid in PDF hexadecimal strings?
Good find.
I have never thought about whitespace in hex strings in the context of signatures etc.; I don't remember ever having seen a PDF with whitespace used there. I assume the same goes for the original authors of 32004, so no one considered this.
I think the idea behind the free of any trailing data you quoted is that the hex data shall fit exactly between the '<' and the '>', in contrast to the situation with signatures where explicitly a padding with '00's is described in the spec. Thus, I assume the intention in 32004 in this regard also includes not allowing for any extra whitespace sprinkled in.
This actually makes me wonder whether signature validators have issues with signature hex string containing whitespace characters... ;)
Thus, I assume the intention in 32004 in this regard also includes not allowing for any extra whitespace sprinkled in.
Indeed. The intention was definitely to exclude all extraneous white space.
This actually makes me wonder whether signature validators have issues with signature hex string containing whitespace characters... ;)
That would indeed be interesting to try... :)
I checked a random sample of my PDF corpora and didn't find any Contents byte-strings with whitespace (e.g. that used EOLs to "beautify" lengthy hex into nice lines) but that is not to say there aren't any.
If whitespace is prohibited in MAC entries in ISO/TS 32004 (which seems to be the consensus), then I think we also need to update the wording in ISO 32000-2:2020 7.3.4.3 Hex strings to note that whitespace is OK unless stated otherwise (or something along those lines). Should I also assume that the same "no whitespace rule" should be added to the Contents entry in a dig-sig dictionary (Table 255 of 32K)? That would then make just two places across all ISO PDF specs where a "no whitespace in hex strings" rule applies.
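As a rough illustration of the difference between the general hex string rule and a "no whitespace" rule, the following sketch contrasts a lenient decoder with a strict check. The function names are hypothetical, requiring an even number of digits in the strict check is an assumption (a DER byte sequence always maps to whole digit pairs), and comments inside hex strings are not handled.

```python
import re

# The six PDF white-space characters: NUL, HT, LF, FF, CR, SP.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "

# Assumption: strict form is '<', one or more hex digit pairs, '>'.
STRICT_HEX = re.compile(rb"<(?:[0-9A-Fa-f]{2})+>")


def is_strict_hex_string(raw: bytes) -> bool:
    """True only if nothing but hex digit pairs sits between '<' and '>'."""
    return STRICT_HEX.fullmatch(raw) is not None


def decode_lenient(raw: bytes) -> bytes:
    """Decode a hex string the general way: white space is ignored and a
    missing final digit is taken to be 0."""
    if not (raw.startswith(b"<") and raw.endswith(b">")):
        raise ValueError("not a hex string")
    digits = bytes(b for b in raw[1:-1] if b not in PDF_WHITESPACE)
    if len(digits) % 2:
        digits += b"0"
    return bytes.fromhex(digits.decode("ascii"))
```

A value such as `<30 03\n0201 05>` decodes fine under the lenient rule but would be rejected by the strict check, which is the behaviour the consensus above seems to want for MAC entries.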
I'm fairly sure I have seen a Contents entry with whitespace, although it was probably a handmade file rather than production. It's completely valid as far as I can see and while we're all used to padding that hex string with zeros, if you can get ASN.1 to decode the same PKCS#7 object out of it then other padding values are theoretically possible (theoretically because this may not work in ASN.1 - I haven't checked).
If this is good enough for the Contents string, why would it be prohibited for a MAC string for 32004? It's a DER encoding of an integer and two octet strings. I'm pretty rusty on the exact rules for ASN.1, but the length is going to be very predictable to within a few bytes. Again, if it decodes to the same value, what's the problem?
EDIT: struck out incorrect text, PDF spec as quoted two comments down requires padding to be zeros.
@faceless2 I can possibly answer that but not in a public forum 🦹. I suggest we continue that thread on the Crypto TWG email list...
The situation we're trying to prevent is, e.g., where you have a ByteRange that preallocated 2000 bytes and your PKCS#7 object encodes to 1000 bytes. The spec is there to prevent you sneaking another PDF object into the 1000 bytes that follow:
<...1000 bytes of PKCS7 ... > # this is OK
<...1000 bytes of PKCS7 ... 00000000000000> # this is OK
<...1000 bytes of PKCS7> /Contents <sneaky> # this is not OK
So it has to be one string, but beyond that I don't see what we're trying to protect against.
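A minimal sketch of how a validator might tell the "OK" cases above from trailing data, working on the already hex-decoded bytes. It assumes a single-byte tag and a definite-length encoding (it refuses the BER indefinite form), and the function names are hypothetical.

```python
def der_total_length(data: bytes) -> int:
    """Return the full length (tag + length octets + content) of the
    definite-length object that starts at data[0]."""
    if len(data) < 2:
        raise ValueError("too short for a tag and a length octet")
    first = data[1]
    if first < 0x80:
        # Short form: the octet itself is the content length.
        return 2 + first
    count = first & 0x7F
    if count == 0:
        raise ValueError("indefinite length is not handled in this sketch")
    if len(data) < 2 + count:
        raise ValueError("truncated length octets")
    # Long form: the next `count` octets hold the content length.
    return 2 + count + int.from_bytes(data[2:2 + count], "big")


def only_zero_padding(decoded: bytes) -> bool:
    """True if everything after the encoded object consists of 00 bytes."""
    total = der_total_length(decoded)
    if total > len(decoded):
        return False
    return all(b == 0 for b in decoded[total:])


def fits_exactly(decoded: bytes) -> bool:
    """True if the encoded object fills the value with nothing after it."""
    return der_total_length(decoded) == len(decoded)
```

Under this reading, `only_zero_padding` would model the padding allowed for signature Contents, while `fits_exactly` would model the "free of any trailing data" wording for MAC tokens.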
@petervwyatt as this post is now crossing with your post, I look forward to you telling me what that is ;-)
@petervwyatt
Do I also assume that the same "no whitespace rule" should be added to the Contents entry in a dig-sig dictionary (Table 255 of 32K)?
I would at least put that up to discussion in the Crypto TWG (or its mailing list you mentioned). On one hand the description of CMS signatures in 32K
For byte range signatures, Contents shall be a hexadecimal string with "<" and ">" delimiters. It shall fit precisely in the space between the ranges specified by ByteRange. Since the length of CMS objects is not entirely predictable, the value of Contents shall be padded with zeros at the end of the string (before the ">" delimiter) before writing the CMS to the allocated space in the PDF file.
could be read to imply that white space characters are not desired, instead extra reserved space shall be filled by padding with zeros. On the other hand this is not explicitly required.
I just wonder, in another comment you said that hex strings can also contain comments:
yes, because hex strings explicitly state "White-space characters (see “Table 1 — White-space characters”) shall be ignored." - and comments are treated as whitespace.
That is something we definitely don't want inside signature Contents or MAC values; this looks too much like allowing the exploits the RUB showed us a few years ago.
Regarding padding: in a digital signature, the exact length of the final ASN.1 payload cannot always be predicted in advance, due to multiple factors. Some examples:
- Externally-sourced timestamps that are applied after signing
- There are several ways to serialise ECC signatures. A popular one (that happens to be the standard one for ECDSA in CMS, if I recall correctly) involves encoding the signature pair (r, s) as an ASN.1 Sequence with two Integer fields. Since Integers are variable-width in DER, this also results in some variability in the output.
With MACs none of these issues exist, since for a given parameter choice, you know exactly how big the "hole" in the PDF needs to be to accommodate the payload. (EDIT: ...so there's no reason to allow padding)
EDIT2: At least, that was the rationale behind the requirement in 32004.
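The ECC point in the list above can be made concrete with a small sketch that DER-encodes a signature pair (r, s) as a Sequence of two Integers and compares output lengths. The helper names are hypothetical and the values are arbitrary illustrations, not real signatures.

```python
def der_length(n: int) -> bytes:
    """Encode a DER definite length (short form below 128, else long form)."""
    if n < 128:
        return bytes([n])
    octets = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(octets)]) + octets


def der_integer(value: int) -> bytes:
    """Encode a non-negative INTEGER with the minimal number of octets.
    A leading 00 is added when the top bit would otherwise be set."""
    body = value.to_bytes((value.bit_length() + 8) // 8, "big")
    return b"\x02" + der_length(len(body)) + body


def encode_signature_pair(r: int, s: int) -> bytes:
    """Encode (r, s) as a DER SEQUENCE of two INTEGERs."""
    body = der_integer(r) + der_integer(s)
    return b"\x30" + der_length(len(body)) + body


# Two 256-bit-range pairs that differ only in whether the top bit is set.
with_top_bit = encode_signature_pair(1 << 255, 1 << 255)
without_top_bit = encode_signature_pair(1 << 254, 1 << 254)
```

Here `with_top_bit` is 72 bytes long and `without_top_bit` is 70 bytes long, so the space needed cannot be fixed to the byte before signing. A MAC token has no such variable-width field, which matches the rationale given above.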
The Cryptography & Provenance TWG recommends:
- adding a recommendation to use the same value for bit 13 across all permission values in the document in case of public key encryption; and
- adding a requirement not to use public key security handler permission values with both bit 13 set to zero and bit 2 set to 1.
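The second recommendation could be checked along these lines. This is only a sketch: the names are hypothetical, and bits are assumed to be numbered from 1 at the low-order end of the permission value.

```python
BIT_2 = 1 << 1    # "permits change of encryption and enables all other permissions"
BIT_13 = 1 << 12  # MAC token requirement bit


def has_conflicting_bits(p: int) -> bool:
    """True if bit 13 is zero while bit 2 is 1, the combination that the
    recommendation above would prohibit for public key security handlers."""
    return (p & BIT_13) == 0 and (p & BIT_2) != 0


# Hypothetical permission values for three recipients.
flagged = [p for p in (BIT_2, BIT_2 | BIT_13, 0) if has_conflicting_bits(p)]
```

Only the first value is flagged, since it sets bit 2 while leaving bit 13 at zero.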