
Radio button form field confusion when using RadiosInUnison and Opt array

Open Lonzak opened this issue 3 years ago • 15 comments

There is a discussion with the LaTeX guys about the interpretation of the PDF standard regarding radio button form fields. I would highly appreciate it if the PDF community could shed some light on the matter.

According to the standard the Opt array is defined as:

An array containing one entry for each widget annotation in the Kids array of the radio button or check box field. [...] When this entry is present, the names used to represent the on state in the AP dictionary of each annotation may use numerical position (starting with 0) of the annotation in the Kids array, encoded as a name object (for example: /0, /1). This allows distinguishing between the annotations even if two or more of them have the same value in the Opt array.

This basically means a 1:1 mapping (for me):

/Kids[ 2 0 R  3 0 R   4 0 R  5 0 R  6 0 R  7 0 R ] 
         |      |       |      |      |      |
/Opt [  (A)    (B)     (C)    (A)    (B)    (C) ]
         |      |       |      |      |      |
N-App[  /0     /1      /2     /3     /4     /5 ]      
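To make the 1:1 reading concrete, here is a minimal Python sketch (not taken from any PDF library; the function name `export_value` is made up for illustration). It assumes each widget's on-state name is simply its index into the Opt array:

```python
def export_value(on_state: str, opt: list[str]) -> str:
    """Resolve a widget's export value from its numeric on-state name.

    Assumes the on-state name is the widget's position in Kids (e.g. "/3").
    """
    index = int(on_state.lstrip("/"))
    return opt[index]


opt = ["A", "B", "C", "A", "B", "C"]
on_states = ["/0", "/1", "/2", "/3", "/4", "/5"]

print([export_value(s, opt) for s in on_states])
# ['A', 'B', 'C', 'A', 'B', 'C']
```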

The RadiosInUnison flag is defined as:

If set, a group of radio buttons within a radio button field that use the same value for the on state will turn on and off in unison; that is if one is checked, they are all checked. If clear, the buttons are mutually exclusive [...].

But how do the two play together?

*crobat generates the following structure:

1 0 obj
<</FT/Btn/Ff 33587200/Kids[2 0 R 3 0 R 4 0 R 5 0 R 6 0 R 7 0 R]/Opt[(A)(B)(C)(A)(B)(C)]/T(radio)>>
endobj
2 0 obj
<</AP <</N <</0 19 0 R/Off 20 0 R>>>>...omitted...>>
endobj
3 0 obj
<</AP <</N <</1 34 0 R/Off 35 0 R>>>>...omitted...>>
endobj
4 0 obj
<</AP <</N <</2 43 0 R/Off 44 0 R>>>>...omitted...>>
endobj
5 0 obj
<</AP <</N <</0 53 0 R/Off 54 0 R>>>>...omitted...>>
endobj
6 0 obj
<</AP <</N <</1 67 0 R/Off 68 0 R>>>>...omitted...>>
endobj
7 0 obj
<</AP <</N <</2 81 0 R/Off 82 0 R>>>>...omitted...>>
endobj
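For reference, the /Ff value 33587200 in object 1 can be decoded with a few lines of Python. This is only a sketch; the helper name `decode_button_flags` is hypothetical, and the bit positions are the 1-based button field flag positions from the PDF specification:

```python
# Button field flag positions (1-based, as numbered in the PDF specification).
BUTTON_FLAGS = {
    "NoToggleToOff": 15,
    "Radio": 16,
    "Pushbutton": 17,
    "RadiosInUnison": 26,
}


def decode_button_flags(ff: int) -> list[str]:
    """Return the names of the button flags that are set in an /Ff value."""
    return [name for name, bit in BUTTON_FLAGS.items() if ff & (1 << (bit - 1))]


print(decode_button_flags(33587200))
# ['Radio', 'RadiosInUnison']
```

So the field above is a radio button field with the RadiosInUnison flag set (2^15 + 2^25 = 33587200).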

Observations: Until now I considered the numerical position (= index) only a tool to resolve the actual value for the on state (which is then located in the Opt array). However, the usage of /0 /1 /2 /0 /1 /2 instead of /0 /1 /2 /3 /4 /5 indicates that this is not the case: when the RadiosInUnison flag is set, the index turns into the name for the on state, which must be identical for widgets that are checked together. But this raises another set of questions:

  1. What is the use of the Opt array in that case if it is not a 1:1 mapping? Then the entries for the 4th to 6th widgets could also be skipped and the array defined as /Opt[(A)(B)(C)] (the argument from the LaTeX guys)? But when I changed the child normal appearance names from /0 /1 /2 /0 /1 /2 to /0 /1 /2 /3 /4 /5, the RadiosInUnison behaviour stopped working in the viewers I tried.
  2. But if only 3 elements are specified, what about the export values of the other elements?
  3. When omitting the RadiosInUnison flag, the document still shows the RadiosInUnison behaviour. OK, this might be an implementation-specific thing, but what should win in this situation?
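The behaviour described above (widgets toggling together whenever they share an on-state name) can be modelled by grouping the Kids by that name. This is a sketch of the observed behaviour only, not of what the spec requires; `unison_groups` is a made-up helper:

```python
from collections import defaultdict


def unison_groups(on_states: list[str]) -> dict[str, list[int]]:
    """Group Kids positions by the on-state name used in each widget's /AP /N."""
    groups: dict[str, list[int]] = defaultdict(list)
    for kid_index, name in enumerate(on_states):
        groups[name].append(kid_index)
    return dict(groups)


# Names as written by the viewer: widgets 0 and 3 share /0, and so on.
print(unison_groups(["/0", "/1", "/2", "/0", "/1", "/2"]))
# {'/0': [0, 3], '/1': [1, 4], '/2': [2, 5]}

# Sequential names: every widget is alone in its group, so nothing toggles in unison.
print(unison_groups(["/0", "/1", "/2", "/3", "/4", "/5"]))
# {'/0': [0], '/1': [1], '/2': [2], '/3': [3], '/4': [4], '/5': [5]}
```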

Lonzak avatar Mar 24 '22 22:03 Lonzak

I changed the LaTeX implementation and now create a full Opt array /Opt[(A)(B)(C)(A)(B)(C)], as Adobe seems to like that better. But like @Lonzak I don't understand how the Opt array and the RadiosInUnison flag relate. I get buttons in unison if I use /0 /1 /2 /0 /1 /2 in the appearances regardless of the state of the flag.

u-fischer avatar Apr 08 '22 14:04 u-fischer

Agreed, for *dobe it also works without the flag. But this is IMHO implementation-specific for their product. The actual ISO PDF specification is quite clear on this matter:

If set, a group of radio buttons within a radio button field that use the same value for the on state will turn on and off in unison; that is if one is checked, they are all checked. If clear, the buttons are mutually exclusive [...].

@u-fischer I would really set the flag for the inunison behaviour even if in *dobe products it works without the flag. I mean setting the flag doesn't hurt, complies with the PDF standard and increases compatibility with other PDF viewers.

Lonzak avatar Apr 11 '22 09:04 Lonzak

I would really set the flag for the inunison behaviour

That is not something I control, it is up to the writer of the button which flags they set.

u-fischer avatar Apr 11 '22 10:04 u-fischer

Ah OK, I am not too familiar with the LaTeX syntax regarding form fields. I thought there might be some automation possible (e.g. when more than one element is selected, then automatically add the RadiosInUnison flag).

Lonzak avatar Apr 11 '22 13:04 Lonzak

What do you mean by "selected"? LaTeX is not a GUI; the form fields are created with code. E.g. https://github.com/latex3/pdfresources/blob/testradio/experiments/radiobuttons.tex

u-fischer avatar Apr 11 '22 13:04 u-fischer

That is clear. I meant selected via "code". If you look at the resulting PDF you can see that the hippo called "Sieglinde" is always selected. I would guess that in the radiobutton.tex file the value "default" is responsible for this. Is the "default" value the only way to (technically) check a radio button? Or does this add the /DV entry? What would happen if the default is specified for several values, or if no radio button is checked?

Update: I tried to compile radiobutton.tex myself but failed (undefined control sequence), so I cannot test it. What would happen if, for group (C), you do not specify "default" so that nothing is selected initially? I bet it would then show correct RadiosInUnison behaviour in *dobe as well. But since the initial widget state for the 3rd group (C) is set inconsistently, the viewers do not "repair" it, so to speak.

What do I mean? (Shortened) original (object 111 has /AS /1 and object 122 has /AS /Off):

104 0 obj
<< /T (C) /FT /Btn /Opt 103 0 R /V /1 /DV /1 /Kids [ 108 0 R 111 0 R 114 0 R 118 0 R 122 0 R 126 0 R ] /Ff 33587200 >>
endobj

111 0 obj
<</Type /Annot/Subtype /Widget /AS /1 /F 0 /Parent 104 0 R /AP 110 0 R /Rect [...]>>
endobj

122 0 obj
<</Type /Annot/Subtype /Widget /AS /Off /F 0 /Parent 104 0 R /AP 121 0 R /Rect [...]>>
endobj

But it must be either (both unchecked: object 111 has /AS /Off and object 122 has /AS /Off):

104 0 obj
<< /T (C) /FT /Btn /Opt 103 0 R /V /Off /DV /Off /Kids [ 108 0 R 111 0 R 114 0 R 118 0 R 122 0 R 126 0 R ] /Ff 33587200 >>
endobj

111 0 obj
<</Type /Annot/Subtype /Widget /AS /Off /F 0 /Parent 104 0 R /AP 110 0 R /Rect [...]>>
endobj

122 0 obj
<</Type /Annot/Subtype /Widget /AS /Off /F 0 /Parent 104 0 R /AP 121 0 R /Rect [...]>>
endobj

Or (both checked: object 111 has /AS /1 and object 122 has /AS /1):

104 0 obj
<< /T (C) /FT /Btn /Opt 103 0 R /V /1 /DV /1 /Kids [ 108 0 R 111 0 R 114 0 R 118 0 R 122 0 R 126 0 R ] /Ff 33587200 >>
endobj

111 0 obj
<</Type /Annot/Subtype /Widget /AS /1 /F 0 /Parent 104 0 R /AP 110 0 R /Rect [...]>>
endobj

122 0 obj
<</Type /Annot/Subtype /Widget /AS /1 /F 0 /Parent 104 0 R /AP 121 0 R /Rect [...]>>
endobj
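The consistency rule I have in mind could be checked roughly like this. It is a sketch under assumptions: `inconsistent_widgets` is a made-up helper, and the on-state name of object 122 is not visible in the shortened dump above (its /AP is an indirect reference), so the `/1` used for it below is hypothetical.

```python
def inconsistent_widgets(v: str, widgets: list[tuple[int, str, str]]) -> list[int]:
    """Return object numbers whose /AS disagrees with the parent field's /V.

    Each widget is (object_number, on_state_name, appearance_state).
    A widget whose on-state name equals /V is expected to show that state;
    every other widget is expected to be /Off.
    """
    bad = []
    for obj_num, on_state, appearance_state in widgets:
        expected = v if on_state == v else "/Off"
        if appearance_state != expected:
            bad.append(obj_num)
    return bad


# Original file: /V /1, object 111 is on, object 122 is off (on-state assumed to be /1).
print(inconsistent_widgets("/1", [(111, "/1", "/1"), (122, "/1", "/Off")]))
# [122]

# Both unchecked with /V /Off, or both checked with /V /1, are consistent.
print(inconsistent_widgets("/Off", [(111, "/1", "/Off"), (122, "/1", "/Off")]))
# []
print(inconsistent_widgets("/1", [(111, "/1", "/1"), (122, "/1", "/1")]))
# []
```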

Lonzak avatar Apr 11 '22 13:04 Lonzak

@lrosenthol - could you please propose improved wording/clarifications for this and related errata?

petervwyatt avatar Apr 27 '24 02:04 petervwyatt

Suggestions :-)

  1. Opt array: "If an Opt array is used it must have the same length as the Kids array."

  2. Opt array: "[...] When this entry is present, the names used to represent the on state in the AP dictionary of each annotation must use numerical positioning (starting with 0) [...]"

  3. If RadiosInUnison is not set, each radio button behaves independently, allowing only one button within the same field to be selected at any time, even if buttons share the same name in the AP dictionary.

...
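Suggestions 1 and 2 would be easy to validate mechanically. A minimal sketch, assuming the on-state names have already been extracted from each widget's /AP /N dictionary (`check_opt_consistency` is a hypothetical helper, not part of any existing tool):

```python
def check_opt_consistency(on_states: list[str], opt: list[str]) -> list[str]:
    """Check suggestion 1 (equal lengths) and suggestion 2 (positional names)."""
    problems = []
    if len(opt) != len(on_states):
        problems.append(
            f"Opt has {len(opt)} entries but Kids has {len(on_states)} widgets"
        )
    for position, name in enumerate(on_states):
        if name != f"/{position}":
            problems.append(f"widget {position} uses on-state {name}, expected /{position}")
    return problems


opt = ["A", "B", "C", "A", "B", "C"]

print(check_opt_consistency(["/0", "/1", "/2", "/3", "/4", "/5"], opt))
# []

for problem in check_opt_consistency(["/0", "/1", "/2", "/0", "/1", "/2"], opt):
    print(problem)
# widget 3 uses on-state /0, expected /3
# widget 4 uses on-state /1, expected /4
# widget 5 uses on-state /2, expected /5
```

Under this strict reading, the structure from my first post would be flagged for widgets 3 to 5.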

Lonzak avatar Apr 29 '24 07:04 Lonzak

First, it is worth looking at the history here...

From the Implementation Notes in the old PDF Spec

The behavior of Acrobat has changed in the situation where a check box or radio button field have multiple children that have the same export value. In Acrobat 4.0, such buttons always turned off and on in unison. In Acrobat 5.0, the behavior of radio buttons was changed to mimic HTML so that turning on a radio button always turned off its siblings regardless of export value. In Acrobat 6.0, the RadiosInUnison flag allows the document author to choose between these behaviors.

And here is the text from the description of /Opt in PDF 1.5, where RadiosInUnison was introduced

When this entry is present, the names used to represent the on state in the AP dictionary of each annotation are computer-generated numbers equivalent to the numerical position (starting with 0) of the annotation in the Kids array. This allows distinguishing between the annotations even if two or more of them have the same value in the Opt array. For example, two radio buttons may have the same on state, but if the RadiosInUnison flag is not set, only one of them at a time can be checked by the user.

And here is some additional useful info about Opt's history

Opt array of text (Optional) This key is used when the value of the field cannot be represented in PDFDoc encoding. There are as many entries in the array as there are widget annotations in one radio button or checkbox field (i.e. they all have the same name). The elements in Opt are text strings, where the ith string represents the on state of the ith widget annotation. When the Opt array is present, the names used to represent the “on” state in each AP dictionary are computer generated numbers (starting with “0” for the first annotation of the field). Opt is also used when two or more annotations have the same “on” value. In this case, Opt will include two or more entries that are equal. Since each annotation has a different computer-generated number to represent its “on” state, this allows checking a single annotation at a time while permitting duplicate on states.
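One possible reading of the historical wording, as a small Python sketch: every widget has a unique positional name, the Opt entry carries the actual on value, and the flag decides whether widgets with equal Opt entries follow each other. The function `widgets_turned_on` is purely illustrative and does not describe any viewer's implementation.

```python
def widgets_turned_on(clicked: int, opt: list[str], radios_in_unison: bool) -> list[int]:
    """Return the Kids positions that end up checked after clicking one widget.

    Assumes "same value for the on state" means "same Opt entry".
    """
    if radios_in_unison:
        return [i for i, value in enumerate(opt) if value == opt[clicked]]
    return [clicked]


opt = ["A", "B", "C", "A", "B", "C"]

print(widgets_turned_on(0, opt, radios_in_unison=True))
# [0, 3]
print(widgets_turned_on(0, opt, radios_in_unison=False))
# [0]
```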

All that said - I don't see the problem...

lrosenthol avatar May 07 '24 04:05 lrosenthol

Thank you Leonard for the clarification. This answers two of my three questions:

  1. "When the Opt array is present, the names used to represent the “on” state in each AP dictionary are computer generated numbers (starting with “0” for the first annotation of the field)."

If the /Opt array is present, the names for the on state in the AP dictionaries must be numerical positions. => But parts of the text you mentioned are not part of the current standard.

  2. "There are as many entries in the array as there are widget annotations in one radio button or checkbox field" and "An array containing one entry for each widget annotation in the Kids array of the radio button or check box field."

Both arrays must have the same length. => /Kids.length = /Opt.length => But parts of the text you mentioned are not part of the current standard.

  3. However, my 3rd question isn't fully answered:

"For example, two radio buttons may have the same on state, but if the RadiosInUnison flag is not set, only one of them at a time can be checked by the user."

Observations: Acrobat (and also the LaTeX implementation) is not compliant with that statement. E.g.

/Opt[ (A) (B) (C) (A) (B) (C) ]
       |   |   |   |   |   |
      /0  /1  /2  /0  /1  /2 

In this case two radio buttons are always checked together, and the state of the RadiosInUnison flag doesn't matter (it can be set or not). When I tried /0 /1 /2 /3 /4 /5 instead, the RadiosInUnison behaviour stopped working in the viewers I tried. I would have thought that the numbering must be sequential...

Lonzak avatar May 07 '24 16:05 Lonzak

I think Table 230 in ISO 32000-2 could be updated to address @Lonzak points 1 and 2 in the previous comment by borrowing some words from earlier PDF specs (subject to ISO-ese).

I will leave @Lonzak point 3 as implementation issues to be discussed elsewhere as I think the PDF archeology @lrosenthol provided is pretty clear as to what is/was intended.

petervwyatt avatar May 11 '24 08:05 petervwyatt

@lrosenthol:

All that said - I don't see the problem...

Well, I do think that RadiosInUnison is not relevant, and so is ignored, if an Opt array is used. With an Opt array you force or avoid "in unison" by using the same or a different index as value. As @Lonzak remarked, this is also how Adobe behaves: in fields with an Opt array the RadiosInUnison flag doesn't matter (can be set or not).

But the wording of the spec is not clear here; every time I read it I wonder again whether it actually says that or not.

u-fischer avatar Jun 11 '24 14:06 u-fischer

Wording in ISO 32000-2:2020 seems very clear albeit it is in 12.7.5.2.3 Check boxes above Table 230: "For radio buttons, the same behaviour shall occur only if the RadiosInUnison flag is set. If it is not set, at most one radio button in a field shall be set at a time.", even if this is different to a popular implementation.

The entire block of paragraphs and bullets below Example 1 through to Table 230 applies equally to both Checkboxes and Radio Buttons, but is confusingly in the subclause titled "Check boxes". The last para in 12.7.5.2.4 Radio buttons is the only place that Opt is mentioned, but only insofar as Unicode is concerned, and RadiosInUnison is only mentioned in an informative NOTE after the first para. This NOTE should probably also not be a note...

So I think things could be rearranged far more logically by introducing a new section (say 12.7.5.2.5) titled something like "Check box and radio buttons" and moving the common paras from 12.7.5.2.3 Check boxes to there, and adding forward references from both 12.7.5.2.3 Check boxes and 12.7.5.2.4 Radio buttons to the common wording and the moved Table 230. (I am against duplicating wording in these sections).

Addressing the additional points suggested by @Lonzak in this comment:

  1. Yes, we should state that Opt and the related Kids arrays must be the same length as this clearly aligns with @lrosenthol's historical wording.

  2. I think this is an example where ISO-ese has lost an historic requirement, probably because processor requirements tend to be avoided: @lrosenthol discovered this "When the Opt array is present, the names used to represent the “on” state in each AP dictionary are computer generated numbers (starting with “0” for the first annotation of the field)." which is now a permissive "may" whereas this old wording is effectively a shall. So yes to 2 also as a file format requirement.

  3. I think if all the above are done this change is unnecessary. It's more of a note anyway...

The spec also doesn't specify how the Opt text string needs to match the AP state name or be converted to an integer. Clearly this should be done in the appropriate canonical form after all escaping (as per J.3.3 and J.3.4) before being compared. I also wonder if these shouldn't be either ASCII strings or only PDFDocEncoded text strings, since they could include Unicode text string BOMs and BCP-47 codes (yuck!)!

petervwyatt avatar Jun 12 '24 08:06 petervwyatt

Wording in ISO 32000-2:2020 seems very clear albeit it is in 12.7.5.2.3

Well, personally I never found it clear, as I don't know what "the same behaviour" exactly refers to. The last bullet point of the list before it speaks about boxes that should be checked independently, while the example below speaks about boxes that should be checked in unison, and it is unclear whether the remark about radio fields relates to the first or the second, and whether it speaks about fields with or without an Opt array.

even if this is different to a popular implementation.

Well, the spec "inherited" RadiosInUnison from the Adobe implementation and its history, and describing it so that it is in line with this implementation sounds OK to me (and much easier to do than to try to change the implementation, or to try to explain why the implementation doesn't do what the spec says).

u-fischer avatar Jun 12 '24 10:06 u-fischer

@petervwyatt Your suggestions for the restructuring (with the paragraphs and the new section) and the clarifications sound good and would be a meaningful improvement of the spec. I do not know the ISO process for this, but who would submit these changes as an official change/improvement? (Just want to ensure that we don't all assume "the other person is doing it".)

The spec also doesn't specify how the Opt text string needs to match the AP state name or converted to an integer.

Interesting, you mean the "AS (appearance state)"? Currently it is defined as a name object (only). But it would be easier to use the index as well, to allow a direct and unique selection of the correct appearance stream... (And that is what the current implementations do)

Lonzak avatar Jul 11 '24 09:07 Lonzak