feat: support `Annot` type for `FieldDictionary.kids`

Open dlzht opened this issue 10 months ago • 8 comments

ISO 32000-2:2020(E) 12.7.4 Field dictionaries (Table 226 — Entries common to all field dictionaries)

| Key | Type | Value |
| --- | --- | --- |
| Kids | array | (Sometimes required, as described below) An array of indirect references to the immediate children of this field. In a non-terminal field, the Kids array shall refer to field dictionaries that are immediate descendants of this field. In a terminal field, the Kids array ordinarily shall refer to one or more separate widget annotations that are associated with this field. However, if there is only one associated widget annotation, and its contents have been merged into the field dictionary, Kids shall be omitted. |

It looks like Kids can refer to widget annotations (Annot) as well as field dictionaries (FieldDictionary). However, the type of FieldDictionary.kids is Vec<Ref<FieldDictionary>>.

dlzht avatar Mar 03 '25 11:03 dlzht
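
For illustration, the rule quoted from Table 226 can be modeled with a small self-contained sketch. The types `Kid`, `FieldNode`, and `WidgetAnnot` below are hypothetical and are not the pdf crate's `FieldDictionary` or `Annot`. The sketch also assumes that "terminal" means "no kid is itself a field", which is one reading of the quoted text.

```rust
// Hypothetical stand-in for a widget annotation (not the pdf crate's Annot).
#[derive(Debug, PartialEq)]
struct WidgetAnnot {
    rect: [f32; 4],
}

// A kid is either a child field or a widget annotation.
#[derive(Debug)]
enum Kid {
    Field(FieldNode),
    Widget(WidgetAnnot),
}

// Hypothetical stand-in for a field dictionary.
#[derive(Debug)]
struct FieldNode {
    name: String,
    kids: Vec<Kid>,
}

impl FieldNode {
    // Assumption: a field is terminal when none of its kids is a field.
    fn is_terminal(&self) -> bool {
        !self.kids.iter().any(|k| matches!(k, Kid::Field(_)))
    }

    // Collect every widget reachable through this field's hierarchy.
    fn widgets(&self) -> Vec<&WidgetAnnot> {
        let mut out = Vec::new();
        for kid in &self.kids {
            match kid {
                Kid::Widget(w) => out.push(w),
                Kid::Field(f) => out.extend(f.widgets()),
            }
        }
        out
    }
}

// Build a non-terminal field with one terminal child that owns two widgets.
fn sample() -> FieldNode {
    let child = FieldNode {
        name: "child".to_string(),
        kids: vec![
            Kid::Widget(WidgetAnnot { rect: [0.0, 0.0, 10.0, 10.0] }),
            Kid::Widget(WidgetAnnot { rect: [0.0, 20.0, 10.0, 30.0] }),
        ],
    };
    FieldNode { name: "root".to_string(), kids: vec![Kid::Field(child)] }
}

fn main() {
    let root = sample();
    println!("{} terminal: {}", root.name, root.is_terminal());
    println!("widgets: {:?}", root.widgets());
}
```

An enum like this only covers the case where kids are separate objects. It does not cover the merged case, where a single dictionary is both the field and its widget.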

Ah yes, this mess. We could add a Union type that parses the dictionary into two parts.

struct Union<T, U>(T, U);
... snip ...
kids: Union<FieldDictionary, Annot>

s3bk avatar Mar 03 '25 16:03 s3bk

Actually it is much worse.

struct Merged<A, B> {
  a: Option<A>,
  b: Option<B>,
}

s3bk avatar Mar 03 '25 16:03 s3bk

@dlzht


// One dictionary parsed as two types at once; either side may be absent.
pub struct Merged<A, B> {
    pub a: Option<A>,
    pub b: Option<B>,
}
impl<A: FromDict, B: FromDict> FromDict for Merged<A, B> {
    fn from_dict(dict: Dictionary, resolve: &impl Resolve) -> Result<Self> {
        // Each side is parsed independently; a parse failure becomes None, not an error.
        let a = A::from_dict(dict.clone(), resolve).ok();
        let b = B::from_dict(dict, resolve).ok();
        Ok(Merged { a, b })
    }
}
impl<A: FromDict+Send+Sync+'static, B: FromDict+Send+Sync+'static> Object for Merged<A, B> {
    fn from_primitive(p: Primitive, resolve: &impl Resolve) -> Result<Self> {
        // Resolve the primitive to a dictionary, then reuse from_dict.
        let dict = p.resolve(resolve)?.into_dictionary()?;
        Self::from_dict(dict, resolve)
    }
}

impl<A: ToDict, B: ToDict> ToDict for Merged<A, B> {
    fn to_dict(&self, update: &mut impl Updater) -> Result<Dictionary> {
        // Serialize whichever sides are present; an absent side contributes an empty dictionary.
        let a = self.a.as_ref().map(|a| a.to_dict(update)).transpose()?.unwrap_or_default();
        let b = self.b.as_ref().map(|b| b.to_dict(update)).transpose()?.unwrap_or_default();
        // Combine the entries of both sides into one output dictionary.
        let mut out = a;
        out.append(b);
        Ok(out)
    }
}
impl<A: ToDict, B: ToDict> ObjectWrite for Merged<A, B> {
    fn to_primitive(&self, update: &mut impl Updater) -> Result<Primitive> {
        self.to_dict(update).map(Primitive::Dictionary)
    }
}

Please check whether that works for you.

s3bk avatar Mar 03 '25 20:03 s3bk
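
To see how the parsing half of `Merged` behaves, here is a toy version that runs on its own. It is only a sketch: `Dict` is a plain `HashMap`, the `FromDict` trait here is a simplified stand-in with no resolver argument, and `Field` and `Widget` are hypothetical types, not the crate's `FieldDictionary` and `Annot`. The keys `T`, `Subtype`, and `Rect` are used as rough markers for "looks like a field" and "looks like a widget".

```rust
use std::collections::HashMap;

// Toy dictionary: key -> string value (not the pdf crate's Dictionary).
type Dict = HashMap<String, String>;

// Simplified, hypothetical FromDict-style trait without a resolver.
trait FromDict: Sized {
    fn from_dict(dict: &Dict) -> Result<Self, String>;
}

#[derive(Debug, PartialEq)]
struct Field {
    name: String,
}

#[derive(Debug, PartialEq)]
struct Widget {
    rect: String,
}

impl FromDict for Field {
    // Assumption for this sketch: a field must carry a T entry.
    fn from_dict(dict: &Dict) -> Result<Self, String> {
        let name = dict.get("T").ok_or("missing T")?.clone();
        Ok(Field { name })
    }
}

impl FromDict for Widget {
    // Assumption for this sketch: a widget has Subtype = Widget and a Rect.
    fn from_dict(dict: &Dict) -> Result<Self, String> {
        if dict.get("Subtype").map(String::as_str) != Some("Widget") {
            return Err("not a widget annotation".into());
        }
        let rect = dict.get("Rect").ok_or("missing Rect")?.clone();
        Ok(Widget { rect })
    }
}

struct Merged<A, B> {
    a: Option<A>,
    b: Option<B>,
}

impl<A: FromDict, B: FromDict> FromDict for Merged<A, B> {
    // Try both parsers on the same dictionary; failures turn into None.
    fn from_dict(dict: &Dict) -> Result<Self, String> {
        Ok(Merged {
            a: A::from_dict(dict).ok(),
            b: B::from_dict(dict).ok(),
        })
    }
}

// Helper to build a toy dictionary from key/value pairs.
fn dict(pairs: &[(&str, &str)]) -> Dict {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

fn main() {
    // A field whose single widget has been merged into it.
    let d = dict(&[("T", "name"), ("Subtype", "Widget"), ("Rect", "0 0 10 10")]);
    let m: Merged<Field, Widget> = Merged::from_dict(&d).unwrap();
    println!("field: {:?}, widget: {:?}", m.a, m.b);
}
```

One consequence of this design is that parsing never fails: a dictionary that matches neither type still yields `Ok` with both sides set to `None`, so callers may want to check for that case themselves.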

Yes, it works for me, thank you.

I also have another question. Annot has the field Annot.rect, which clearly matches ISO 32000-2:2020 12.5.2. But FieldDictionary has a field FieldDictionary.rect too, and I cannot find this rect in ISO 32000-2:2020 12.7.4.

My guess: FieldDictionary.rect was introduced to handle the case where there is only one associated widget annotation and its contents have been merged into the field dictionary. If we add the type Merged<A, B>, this field becomes redundant. Is my guess correct?

dlzht avatar Mar 04 '25 04:03 dlzht

I am pretty sure that is from my confusion when I discovered that the same dictionary contains two different objects. Just delete it from the one that isn't supposed to have it.

s3bk avatar Mar 04 '25 04:03 s3bk

Yeah, it really confused me when I first learned that different PDF types can be merged into the same dictionary. It makes programming against them quite inconvenient.

In addition, I have a suggestion: we could split the types (pdf/src/object/type) into separate .rs files, gathered in a directory named type, which would be clearer. If you think this is acceptable, I can do the work and submit a PR.

dlzht avatar Mar 04 '25 04:03 dlzht

Yes, I will separate the types. I don't want a PR because that is a lot of changed lines I have to check.

s3bk avatar Mar 04 '25 04:03 s3bk

There will indeed be a lot of work to do. Thank you for your great work.

dlzht avatar Mar 04 '25 04:03 dlzht