core icon indicating copy to clipboard operation
core copied to clipboard

image API: control order of image operations

Open bertsky opened this issue 4 years ago • 6 comments

Naturally, with operations like deskewing/rotation and cropping the resulting image size and relative coordinates depend on the relative order of the operations. We already respect that by following the exact order of image features in AlternativeImage/@comments when reconstructing coordinates.

But reconstructibility not always enough. One might need to enforce a certain ordering for producing new images.

Consider for example the following realistic workflow:

  1. binarization (binarizes the raw image)
  2. deskewing (rotates 1.)
  3. cropping (crops 2.)
  4. some consumer that needs both a binarized image (from 3.) and a raw image, and naturally expects them to be in the same coordinate system

Here we get a problem: In 4., the request image_from_page(page, feature_filter='binarized') will make core produce a new image by applying cropping and deskewing to the raw image. (This is the default ordering of core.) But having a different ordering, that second image will be larger and in a different coordinate system. So the consumer gets an inconsistent pair of images (which it would have to check and compensate for by looking at the two returned coordinate transforms).

There is no cheap solution here I'm afraid:

  • We cannot just change the default ordering, because other workflows might be using another ordering, too.
  • We cannot just look into the existing derived images and adopt their ordering, because (depending on the workflow) we would not know which of the images is the right one (and they could derive from differently ordered sub-workflows). That is, at the time the second image_from_page request is made, we don't know what was in the other.
  • Even if we allow passing a parameter to make the caller control the ordering, the caller might forget to do that (or not be aware of the necessity, rightly expecting to get the same kind of result automatically)

So what the heck are we gonna do about this?

bertsky avatar May 03 '21 15:05 bertsky

On the other hand – forgive my confusion – since d6c51ca5f40090e2a6ac305b034f124b05f4edd9 we should actually not see any difference, because we already re-crop after rotation whenever possible.

bertsky avatar May 03 '21 17:05 bertsky

On the other hand – forgive my confusion – since d6c51ca we should actually not see any difference, because we already re-crop after rotation whenever possible.

Darn! I think that commit was (well intentioned but) still flawed: when re-cropping via _crop(), which is the general function, it would first check whether cropped already applied on the image (which it is, but on an earlier state, now only half-way), and in that case only apply the corresponding coordinate transform, but not the actual image operation. So we need to bypass that feature check here.

bertsky avatar May 03 '21 17:05 bertsky

I think that commit was (well intentioned but) still flawed: when re-cropping via _crop(), which is the general function, it would first check whether cropped already applied on the image (which it is, but on an earlier state, now only half-way), and in that case only apply the corresponding coordinate transform, but not the actual image operation. So we need to bypass that feature check here.

#688 fixes that (just tested it). So what remains is the possibility for 1px (rounding) differences.

That's still problematic for array operations, though ...

bertsky avatar May 03 '21 17:05 bertsky

@bertsky What is still left to do beyond https://github.com/OCR-D/core/pull/688 and how can I help?

kba avatar Feb 08 '22 18:02 kba

@kba I don't know TBH. It would help if someone else thought it through. (Perhaps we are missing something here.)

bertsky avatar Feb 08 '22 22:02 bertsky

What is still left to do beyond #688 and how can I help? I don't know TBH. It would help if someone else thought it through. (Perhaps we are missing something here.)

@kba Turns out we did miss something!

In this comment I alluded to the possibility of making recropped an implicit feature (subsumed under deskewed). But we did not implement that. ~~(We would have had to remove the elif block in workspace._rotate and suppressed concatenating recropped to the feature list in workspace._crop.)~~

However, to correctly keep it an explicit feature, we should have ensured that recropped is always added to the feature list during workspace._crop (not just for the page level).

Note that the problem is subtle: You can only see this if recropping after rotation actually makes a difference for your coordinates (i.e. when the are not just rectangles and at least one corner gets chopped off by the new minimal bbox). EDIT: And the problem only causes a false warning to be printed – no actual coordinates/images being wrong. (The re-recropped image never gets used in the elif block, only its coordinates.)

So it is still buggy, and we should take the opportunity to reconsider that decision:

  1. On the one hand, making it an implicit feature (an implementation detail or convention of the deskewed feature if you will) – with the above mentioned change (plus some updates to the tests).
  2. On the other, keeping it truly explicit (dragging recropped along, and not enforcing it) – with the other change. But that would actually invalidate all deskewed derived images in existing datasets (because they lack the recropped feature annotation).

I'd tend towards 1.

EDIT: It turns out this is not a bug in the (image/coordinate) data, only a fake warning. We just needed to avoid entering the image op, cf. #820.

The good news is that now we at least have an actual image API test.

bertsky avatar Mar 15 '22 12:03 bertsky