TPPDF Watermarks for Images

Hi,

I've posed this as a question on Stack Overflow but it may be better suited as a feature request.

I've tried two approaches to add an image or attributed string watermark on top of an image in a TPPDF document:

Affix the watermark to the image with core graphics before adding the image to the document; ~~this isn't great as the watermark is subject to resizing and quality changes during PDF generation.~~ I've submitted #232, which demonstrates a possible solution with no resizing/quality issues.
Position the watermark over the image in the document. This seems like the right thing to do per #19 but the API seems to have changed. I've attempted to find an unobtrusive way to expose image offsets and set watermark offsets such that the watermark can be correctly positioned over the image. I haven't had much success so far.

Could you point me in the right direction or let me know what you think? I'd be very happy to contribute to this solution.

Sep 10 '20 05:09 chrisgonzgonz

Hi there, sorry I haven't responded earlier, busy times and TPPDF is a side project for me.

I like the idea of rendering hooks, so the framework stays light but you can adapt it if necessary, as I do not believe TPPDF should include too much image processing (hardly optimized an way better frameworks available).

About the closure pattern:

Do you think this pattern is suited the best?
Should we consider the delegation-pattern?
How about a delegate for the whole PDFGenerator instead of a per-object level.

Would like to hear your thoughts.

Sep 23 '20 08:09 philprime

No problem @philprime , thanks again for your work!

- Do you think this pattern is suited the best? Should we consider the delegation-pattern? The closure made the most sense to me because it ties one rendering hook to a specific image. If more rendering hooks are to be added, maybe a delegate likePDFImageDelegate would make more sense?

- How about a delegate for the whole PDFGenerator instead of a per-object level. With a delegate on PDFGenerator, to apply individual treatment to multiple images we would need to give the images identifiers and centralize the logic. This seems like a little more overhead but would make sense if:

We favor centralizing the logic over setting it on each PDFImage.
We want to support many rendering hooks.

Please let me know if you had something else in mind, especially if I'm missing something with the PDFGenerator delegate idea. I'm happy to dig in some more.

Sep 23 '20 10:09 chrisgonzgonz

The main idea of the PDFDocument is to be a tree-like structure holding only information to generate a PDF based on it. About one or two years ago I decided to move all logic from each PDF Element into the internal objects, so that a PDFDocument is not mutated during rendering (therefore instances of PDFDocument can be reused). This idea is applied almost everywhere (I guess PDFSections do not use it correctly).

Now we could add closure-based logic to the document tree, after all they supply additional information to the document tree. But I see a few flaws:

What if multiple images should be processed in the same way? Yes, I could pass all the same lambda functions to all of the images.
Equality can not be determined anymore. As per definition closures can have arbitary logic and therefore cannot be compared (unfortunately I can not find the link to this information anymore, but I battled with this issue a while ago).
As you stated: Additional rendering hooks might be requested at some point, and therefore having a unified solution from the beginning makes life easier for us.

About your input with the image identifiers: We actually do not need them! PDFImage is a class, therefore we can keep a reference during the PDFDocument build an our delegate might look like this:

protocol PDFGeneratorImageDelegate: class {
    func generator(willBeginDrawingImage image: PDFImage, with context: CGContext, in frame: CGRect)
}

typealias PDFGeneratorDelegate = PDFGeneratorImageDelegate // for extensibility

and an example implementation:

let generator = PDFGenerator()
generator.delegate = myCustomDelegate

let document = PDFDocument(format: .a4)
[...]
let watermarkImage = PDFImage(...)
document.add(image: watermarkImage)

[...]
// in the delegate class

func generator(willBeginDrawingImage image: PDFImage, with context: CGContext, in frame: CGRect) {
    if image === watermarkImage {
        ....
    }
}

Sep 24 '20 06:09 philprime

I see what you mean now by delegating from PDFGenerator. Great point about the identifiers!

I like this approach; I agree that it is cleaner and more extensible.

I'm happy to put together a PR unless you have it sorted out with your example implementation.

Sep 24 '20 09:09 chrisgonzgonz

Looking forward to a PR, I'll give feedback during review.

Sep 25 '20 07:09 philprime

#241 implements PDFImageDelegate as discussed. Please let me know what you think.

Some considerations that hadn’t occurred to me before:

Example use case: a watermarked timestamp unique to each image Closures are nice for capturing data that doesn’t belong in the PDFImage for image processing later. This data is usually closest at hand when we’re creating the PDFImage in the first place. This isn’t a big deal, one work around when using the delegate pattern is to cache the PDFImage and extra data in a dictionary with the PDFImage as a key. To make this possible, I made PDFImage Hashable.

Edge Case: Not a big deal, but there is an edge case where we are adding a PDFImage with the same values in different places of the document and we may want to apply different processing for each. In this case, a tag like we previously discussed could be useful.

Oct 12 '20 07:10 chrisgonzgonz

Thanks a lot for your contribution, I am always glad when others help to evolve the framework.

I will give you some code feedback directly in the PR #241

About the example use case:

Adding Hashable to the PDFImage is a nice addition, maybe we can even write some unit tests for it (the general test coverage is in poor shape and needs to be improved anyway)?

About the Edge case:

On the one hand I can see some kind of identifier, tag or similar for uniquely identifying the elements but keeping the memory usage low by reusing the elements. On the other hand when adding a tag, it would have to be set either by the internal objects, which are not available for the framework user anyway, or we would have to duplicate the PDFImage and set the different tag, leading to another instance again. Maybe the easiest way would be creating a Factory-Pattern which creates the PDFImage with the correct settings, returning a fresh instance on every call, in your application, and storing the references in different properties.

Correct me if I am wrong

Oct 12 '20 18:10 philprime

I'm happy to help, I really appreciate the framework and your work here! I'll add test coverage to the PR.

Edge Case: So if I understand correctly, the options are:

Each PDFImage created is a separate instance. This is how PDFImage is currently implemented so the only change would be adding an optional identifier instance variable to PDFImage. This could be done automatically and exposed throught the identifier instance variable (i.e. tag each new instance with a UUID) or let a user optionally set this on or after creation.
We reuse instances of PDFImage? If I'm understanding correctly this would be like a UITableViewCell, where we could recycle this PDFImage using a datasource pattern.

Am I correct in thinking the factory pattern would be used for option 1 and would basically serve as a convenience initializer for instances that will make use of identifiers, as below? Or are you thinking of making use of PDFImage subclasses.

PDFImage.init(image: Image,
                caption: PDFText? = nil,
                size: CGSize = .zero,
                sizeFit: PDFImageSizeFit = .widthHeight,
                quality: CGFloat = 0.85,
                options: PDFImageOptions = [.resize, .compress],
                cornerRadius: CGFloat? = nil,
                identifier: String? = nil)

Oct 13 '20 21:10 chrisgonzgonz

To quote your previous request:

[...] edge case where we are adding a PDFImage with the same values in different places of the document and we may want to apply* different processing for each* [...]

I'll try to illustrate, how I am understanding it, with an example:

let document = PDFDocument(...)
let image1 = PDFImage(...) 
document.add(image: image1)
// add more stuff to the document
document.add(image: image1)
let generator = PDFGenerator()
generator.render(...)

// in the delegate class
func generator(willBeginDrawingImage image: PDFImage, with context: CGContext, in frame: CGRect) {
    if image === image1 {
        ....
    }
}

We can not differ between the first added image1 and the secondly added image1 as they are the same instance in memory. Now if we do something like this:

let document = PDFDocument(...)
let image1 = PDFImage(...) 
image1.id = "image-1"
document.add(image: image1)
// add more stuff to the document
image1.id = "image-2"
document.add(image: image1)
let generator = PDFGenerator()
generator.render(...)

// in the delegate class
func generator(willBeginDrawingImage image: PDFImage, with context: CGContext, in frame: CGRect) {
    if image == "image-1" {
        ...
    } else if image == "image-2" {
        ...
    }
}

it would seem like setting different identifiers would let us differentiate, but acutally this approach won't work as image1 is a class instance, therefore setting image1.id = "image-2" will overwrite the image1.id = "image-1". Keep in mind, our document tree is a simple data-tree which won't be processed until using it with PDFGenerator.

About the internal logic, this is an excerpt from PDFDocument.swift:108:

public func add(_ container: PDFContainer = PDFContainer.contentLeft, image: PDFImage) {
        objects += [(container, PDFImageObject(image: image))]
}

So when adding a PDFImage to the document tree, it wraps it in an PDFImageObject. Now we could set the unique identifier on the PDFImageObject, but as I have mentioned before, this is an internal class, and shall not be seen by the user. Now we could do something like this:

func generator(willBeginDrawingImage image: PDFImage, with context: CGContext, in frame: CGRect, objectId: String) {
    if objectId == "image-1" {
        ...
    } else if objectId == "image-2" {
        ...
    }
}

but we would have to change all methods in PDFDocument to add methods like these:

public func add(_ container: PDFContainer = PDFContainer.contentLeft, image: PDFImage, withId objectId: String = UUID().uuidString) {
        objects += [(container, PDFImageObject(objectId: objectId, image: image))]
}

This does not seem reasonable to me, in comparsion to the following:

class MyImageFactory {

    static func create(with image: UIImage) -> PDFImage {
        return PDFImage.init(image: image,
                                           caption: PDFText? = nil,
                                           size: CGSize = .zero,
                                           sizeFit: PDFImageSizeFit = .widthHeight,
                                           quality: CGFloat = 0.85,
                                           options: PDFImageOptions = [.resize, .compress],
                                           cornerRadius: CGFloat? = nil,
                                           identifier: String? = nil)
    }
}

// Assetse
let image = UIImage(...)

// Document Creation
let document = PDFDocument(...)
let image1 = MyImageFactory.create(with: image)
document.add(image: image1)
// add more stuff to the document
let image2 = MyImageFactory.create(with: image)
document.add(image: image2)
let generator = PDFGenerator()
generator.render(...)

// in the delegate class
func generator(willBeginDrawingImage image: PDFImage, with context: CGContext, in frame: CGRect) {
    if image === image1 {
        ...
    } else if image === image2 {
        ...
    }
}

This might not be memory efficient as we create multiple PDFImage instances with the same settings/content, but we can easily differentiate them by their memory address, rather than with adding a whole identifier management.

Maybe I should implement an assertion mechanism if one adds the same element twice to the document tree. This has been mentioned in another issue before.

So to answer 2.:

No, not like UITableViewCell.

I hope this clarifies my reasoning more. If I oversee something, or misunderstood, please let me know

Oct 14 '20 09:10 philprime

Ok great, thank you for that illustration. That is very helpful.

I agree, I think it would be best to not alter PDFImageObject as I think this can be worked with existing public APIs.

This looks good to me, but the design issue I stumble on here is this:

How would you store image1 and image2 for access when the delegate method is called? To perform these delegate equality checks, we need the exact instance (as noted by your check for identical equality===). We wouldn't be able to distinguish between image1 and image2 as keys in a dictionary because they would be equal according to our hash function.

This is a common use case with the UITableViewDelegate.cellForRowAtIndexPath- some sort of data structure, temporary or otherwise where information can be stored relevant to that image that can be called by a relevant key. I know we are not using the same pattern for UITableViewCell here but it follows a similar sequence - a callback when a cell or image is being created to represent that which is stored in this data structure. I guess a better example would be assigning different tags to each UITextField administered by the same UITextFieldDelegate.

Also- Is the benefit of MyImageFactory that we get identical initialization settings without having to repeat this multiple times?

Instead of the below, which would also accomplish the same result:

let image1 = PDFImage.init(image: image,
                                           caption: PDFText? = nil,
                                           size: CGSize = .zero,
                                           sizeFit: PDFImageSizeFit = .widthHeight,
                                           quality: CGFloat = 0.85,
                                           options: PDFImageOptions = [.resize, .compress],
                                           cornerRadius: CGFloat? = nil,
                                           identifier: String? = nil)
document.add(image: image1)

// add more stuff to the document
let image2 = PDFImage.init(image: image,
                                           caption: PDFText? = nil,
                                           size: CGSize = .zero,
                                           sizeFit: PDFImageSizeFit = .widthHeight,
                                           quality: CGFloat = 0.85,
                                           options: PDFImageOptions = [.resize, .compress],
                                           cornerRadius: CGFloat? = nil,
                                           identifier: String? = nil)
document.add(image: image2)

Oct 16 '20 01:10 chrisgonzgonz

I just created a hashing example and it confirms your assumption: saving additional data in a dictionary using the PDFImage as the key won't work (unfortunately) as the hash returns the same value

class Foo: Equatable, Hashable {
    var bar: String

    init(bar: String) {
        self.bar = bar
    }

    static func == (lhs: Foo, rhs: Foo) -> Bool {
        lhs.bar == rhs.bar
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(bar)
    }
}

let foo1 = Foo(bar: "123")
let foo2 = Foo(bar: "123")

var hasher1 = Hasher()
var hasher2 = Hasher()

foo1.hash(into: &hasher1)
foo2.hash(into: &hasher2)

print(hasher1.finalize())
print(hasher2.finalize())

var dict: [Foo: String] = [:]

dict[foo1] = "Foo1"
print(dict) // prints --> [main.Foo: "Foo1"]
dict[foo2] = "Foo2"
print(dict) // prints --> [main.Foo: "Foo2"]

The Factory pattern is just something I wanted to recommend for creating the same object multiple times. It is actually kind of out of scope, because yes, yo ucould also just copy-paste it, or move it to some function. The important part, is that it is not the same object --> different addresses in the memory

About your UITableView example: the datasource method looks up the object using the given IndexPath for looking up the associated data. Basically you create an array of objects, and using the indexPath.section you directly get the object from the array.

I do have two options to propose:

The users application wraps the PDFImage on their own, with something like this:

class IdentifiablePDFImage: Equatable, Hashable {
    var id = UUID().uuidString
    var image: PDFImage

    init(image: PDFImage) {
        self.image = image
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
        hasher.combine(image)
    }
}

let image1 = IdentifiablePDFImage(image: ...)
let image2 = IdentifiablePDFImage(image: ...)

let data = [
    image1: "more data here",
    image2: "different data here"
]

// in the delegate class
func generator(willBeginDrawingImage image: PDFImage, with context: CGContext, in frame: CGRect) {
    let key = x.keys.first(where: { $0.image === image })
    let additionalData = data[key]
}

Add a tag/identifier to the PDFImage superclass PDFDocumentObject so it is possible to identify the object in the delegation method. UIKit uses a integer tag, but maybe a string be suitable too.

After realizing that the same memory address suffices for === identity check, but not for hashtables/dictionaries, I would prefer 2. too. But instead of letting the framework set an id, let the user set it from the outside, as PDFDocumentObject is a public API.

I created a branch 231-generator-delegate. I added the tag interface and propagated it through the framework. Can you please rebase your progress onto my branch? Also because I derived it from develop. Before you do so, please review your Pull Request #241 as I reviewed it a while ago with comments

Dec 04 '20 19:12 philprime

I found some issues with 231-generator-delegate I will have to fix first, will let you know soon

Dec 05 '20 18:12 philprime

Alright fixed the branch, either base the changes on that one or on develop

Dec 05 '20 21:12 philprime

Thanks @philprime , will do!

Dec 06 '20 19:12 chrisgonzgonz

Hi @chrisgonzgonz time flies and it's already been a year 😄 Are you still working on this issue and the PR you opened up?

Dec 07 '21 08:12 philprime

Hi @philprime , it certainly has 😄

I'd love to get this issue finished and merged. I will update the PR.

Dec 07 '21 10:12 chrisgonzgonz

And another year has passed. 😄

To make sure your work is not lost, is it ok if I push your changes to this main repo and take over the PR?

Dec 03 '22 15:12 philprime

Sure @philprime , thank you

Dec 04 '22 11:12 chrisgonzgonz