Feature Request: handle Subject* metadata in Resize/Crop/Rotate transforms

Open vpenades opened this issue 7 years ago • 4 comments

Prerequisites

  • [x] I have written a descriptive issue title
  • [x] I have verified that I am running the latest version of ImageSharp
  • [x] I have verified if the problem exists in both DEBUG and RELEASE mode
  • [x] I have searched open and closed issues to ensure it has not already been reported

Description

Right now, when performing transformations that involve changing the pixel dimensions of the image, the Pixel Size metadata is automatically changed to match the new dimensions.

But there are other metadata tags that hold pixel position information:

  • SubjectLocation ( interpreted as a point within the image )
  • SubjectArea ( interpreted as a point, circle, or rectangle within the image )

My guess is that these tags are designed to mark the "main subject" within an image; if the image was taken with a camera, it's probably the point or area the camera used to focus.

So, if the value is properly set, it can be used to speed up operations like finding a face within an image (assuming the face is the main subject).

But if manually set, it can also be used for other purposes, such as defining the "center/offset" of the image, or even the circle/rectangle of the "main element" within the image. In my case, I would like to use this tag to define the "Pivot" of a sprite in videogame asset authoring. Having a "pivot" in the image can be useful in many ways, and maybe, at some point in the future, GraphicsOptions could have an Enum to define the reference coordinate system as "TopLeft" or "SubjectPivot" or something like that.

I have a helper class to simplify access to these tags here.

My request is that, when performing operations such as Resize, Crop, Rotate, etc, the Subject* tags are adjusted accordingly, so the area represented by the Subject stays meaningful.

I agree some transforms might be difficult or impossible to perform, so as fallback solutions I propose:

  • If the transformation is impossible to resolve, remove the tags.
  • If the transformation can be partially solved, fit the result as closely as possible. For example, if SubjectArea represents a rectangle and we rotate the image 45 degrees, we could replace the rectangle with a circle, which is also supported by SubjectArea.
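
To make the request concrete, here is a minimal sketch of the adjustments described above, written in Python rather than ImageSharp's C#. The function names are hypothetical and are not part of any ImageSharp API. It assumes the EXIF layout of SubjectArea (2 values = point, 3 = circle as center plus diameter, 4 = rectangle as center plus width and height), clockwise rotation about the image center, and a canvas that grows to fit the rotated image.

```python
import math

def resize_subject_area(area, old_size, new_size):
    """Scale a SubjectArea tuple when the image is resized."""
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    cx, cy = round(area[0] * sx), round(area[1] * sy)
    if len(area) == 2:                       # point
        return (cx, cy)
    if len(area) == 3:                       # circle: cx, cy, diameter
        d = area[2]
        if math.isclose(sx, sy):
            return (cx, cy, round(d * sx))
        # Non-uniform scaling turns the circle into an ellipse;
        # fall back to its bounding rectangle.
        return (cx, cy, round(d * sx), round(d * sy))
    return (cx, cy, round(area[2] * sx), round(area[3] * sy))

def crop_subject_area(area, crop):
    """Shift a SubjectArea into crop coordinates; crop is (x, y, w, h).
    Returns None (meaning: remove the tag) if the center is cropped away."""
    x, y, w, h = crop
    cx, cy = area[0] - x, area[1] - y
    if not (0 <= cx < w and 0 <= cy < h):
        return None
    return (cx, cy) + tuple(area[2:])

def rotate_subject_area(area, size, degrees):
    """Rotate a SubjectArea clockwise about the image center."""
    w, h = size
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    # Size of the canvas that contains the whole rotated image.
    new_w = abs(w * c) + abs(h * s)
    new_h = abs(w * s) + abs(h * c)
    dx, dy = area[0] - w / 2, area[1] - h / 2
    cx = round(dx * c - dy * s + new_w / 2)
    cy = round(dx * s + dy * c + new_h / 2)
    if len(area) < 4:                        # point or circle: shape unchanged
        return (cx, cy) + tuple(area[2:])
    rw, rh = area[2], area[3]
    if degrees % 180 == 0:
        return (cx, cy, rw, rh)
    if degrees % 90 == 0:                    # quarter turn swaps width/height
        return (cx, cy, rh, rw)
    # Arbitrary angle: replace the rectangle with its circumscribed circle.
    return (cx, cy, round(math.hypot(rw, rh)))
```

The circumscribed circle is only one possible fallback for arbitrary angles; an axis-aligned bounding rectangle of the rotated region would work as well.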

I agree that this is a group of metadata tags that are rarely used (I had a hard time finding official examples; I found this, which, by the way, has the subject area outside the image due to an image resize), so maybe an intermediate solution would be to have some sort of callback.

vpenades avatar Feb 09 '18 10:02 vpenades

@vpenades It's been a few years....

Is this something you're still interested in?

JimBobSquarePants avatar Jul 31 '24 14:07 JimBobSquarePants

hi, sorry for late reply, busy as usual...

I haven't touched the topic in a few years, but the issue is fairly straightforward to understand: some metadata in images references the image at its current dimensions, for example the ROI metadata. If the image is manipulated in some way, like being resized or cropped, the ROI stored in the metadata is no longer valid.

The only way I could think to fix this is that the metadata contains some kind of interface or listener that is called whenever the image is transformed in some way.
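
As a rough illustration of that listener idea (a sketch in Python, not ImageSharp code; `TransformAware`, `RegionOfInterest`, and this toy `Image` class are all hypothetical names), each metadata entry that depends on pixel coordinates could expose a hook that the image calls on every transform, returning `False` when the entry can no longer be kept valid:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Protocol

Point = tuple[float, float]
# Maps a point from old to new image coordinates, or None if it is lost.
PointMap = Callable[[Point], Optional[Point]]

class TransformAware(Protocol):
    def on_transform(self, map_point: PointMap) -> bool:
        """Update in place; return False to have the entry removed."""
        ...

@dataclass
class RegionOfInterest:
    center: Point

    def on_transform(self, map_point: PointMap) -> bool:
        mapped = map_point(self.center)
        if mapped is None:
            return False
        self.center = mapped
        return True

@dataclass
class Image:
    width: int
    height: int
    metadata: list = field(default_factory=list)

    def resize(self, new_width: int, new_height: int) -> None:
        sx, sy = new_width / self.width, new_height / self.height
        self._notify(lambda p: (p[0] * sx, p[1] * sy))
        self.width, self.height = new_width, new_height

    def crop(self, x: int, y: int, w: int, h: int) -> None:
        def map_point(p: Point) -> Optional[Point]:
            q = (p[0] - x, p[1] - y)
            return q if 0 <= q[0] < w and 0 <= q[1] < h else None
        self._notify(map_point)
        self.width, self.height = w, h

    def _notify(self, map_point: PointMap) -> None:
        kept = []
        for entry in self.metadata:
            handler = getattr(entry, "on_transform", None)
            # Entries without a hook are position-independent; keep them.
            if handler is None or handler(map_point):
                kept.append(entry)
        self.metadata = kept
```

Entries that do not care about pixel positions never see the transform, which keeps the cost of the mechanism confined to the few tags that need it.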

It's the kind of stuff that's easy to say but extremely difficult to do, especially taking into account the heterogeneous nature of metadata.

So I would tag this feature request as a "nice to have, but not worth the development time".

On the other hand, I would try to improve the editing and manipulation of metadata as much as possible, because I think it's an underexploited feature of images.

vpenades avatar Aug 11 '24 20:08 vpenades

It's happening....

Image

Image

JimBobSquarePants avatar Jun 12 '25 07:06 JimBobSquarePants

That's great!!

This is relevant to AI because I think image metadata is heavily underused; pipelines typically rely on external metadata files that can easily fall out of sync with the actual content of the image. And most vision models require the image to be resized just before it is sent to the model.

vpenades avatar Jun 15 '25 09:06 vpenades