
AI Masks

Open mamvdwel opened this issue 2 years ago • 13 comments

Is your feature request related to a problem? Please describe. Yes: the problem of creating a good mask for complex areas (i.e. when it is very difficult or labour-intensive to create one).

Describe the solution you'd like Typing (or choosing from presets) the area you would like to select (e.g. 'sky' or 'trees'). An AI should then create a parametric mask, a drawn mask, or a combination of both in the module at hand. This mask can then, if necessary, be fine-tuned by the user.

Alternatives None

Additional context This would speed up the creation of masks tremendously.

mamvdwel avatar Aug 09 '22 12:08 mamvdwel

After the AI is implemented by someone (that's the less complex part of the job), who should do the training? You're aware that the training must be done for each module in the pipe, since the input values are not identical but depend on the preceding modules.

MStraeten avatar Aug 09 '22 18:08 MStraeten

Hi Martin,

Thanks for replying to my feature request. Regarding your questions, I see this as follows: the AI should interpret the image in the state it is in at the input of the module at hand, and then try to find the indicated object in that image. Another option could be for the AI to only work with the image at the beginning of the pipeline, if that is easier. I'm not an AI expert, but for a trained AI it shouldn't matter in which module this is done: an AI that recognizes a cat will find that animal in many images and in many states of an image, just as a human being can. Of course success is not always guaranteed, but that is to be expected. If the AI manages to find the requested object, it can create a mask. On second thought (I implied otherwise in my request), a raster mask, which can be used in other modules, may be the most logical choice, with the option for the user to combine this mask with a mask drawn by the user (this would be new functionality as well, but would be cool).
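To make the mask-combination idea concrete, here is a minimal sketch (my own illustration, not darktable code; the function name, the "union"/"intersection" modes, and the flat float-mask representation are all assumptions) of combining an AI-generated raster mask with a user-drawn mask per pixel:

```python
def combine_masks(ai_mask, drawn_mask, mode="union"):
    """Combine two float masks with opacities in [0, 1], pixel by pixel.
    'union' keeps the stronger opacity, 'intersection' the weaker.
    Illustrative sketch only, not darktable's actual combine logic."""
    op = max if mode == "union" else min
    return [op(a, d) for a, d in zip(ai_mask, drawn_mask)]
```

With this, the drawn mask could either extend the AI mask (union) or cut it down (intersection), which covers the "fix a hole the AI missed" case mentioned later in this thread.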

As far as training the AI is concerned: pre-trained libraries for object recognition seem to exist (see for instance this page: https://stackabuse.com/object-detection-with-imageai-in-python/), which could be a good starting point. Along with that, the AI could be improved/trained further from the drawn masks users create to combine with the AI-generated masks. It would be cool if the user could optionally export the locally trained model and send it to the darktable team, so that it can be combined with the contributions of other users to further optimize the AI. This optimized model could then be distributed back to users as an add-on (and/or incorporated in a next version of the product). This way the whole community can contribute to improving the AI.

I hope this helps

Best regards

mamvdwel avatar Aug 09 '22 18:08 mamvdwel

I have been thinking a lot about such a feature recently, since I very successfully used rembg, which saved me days of work with better results than I was ever able to achieve with regular masks (examples at the end).

I could imagine such a tool being implemented as a module that is essentially a no-op for the pipeline but provides a raster mask for later modules. It would come early in the pipeline, maybe just before exposure, and bring its own stripped-down, fixed image pipeline to prepare the image for the AI: essentially exposure and a generic base curve, to ensure the image data is in a condition similar to the AI's training data.
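As a rough illustration of that fixed preparation pipeline (a sketch under the assumption of linear RGB values in [0, 1]; the function name, the plain gamma standing in for a base curve, and the parameters are all hypothetical, not darktable internals):

```python
# Hypothetical sketch of the fixed preparation pipeline described above:
# apply an exposure gain, then a generic tone curve, so the data looks
# like the display-referred images the AI was likely trained on.

def prepare_for_ai(pixels, exposure_ev=0.0, gamma=1.0 / 2.2):
    """Map linear values in [0, 1] to a display-like range."""
    gain = 2.0 ** exposure_ev
    out = []
    for v in pixels:
        v = min(max(v * gain, 0.0), 1.0)   # exposure, clipped to [0, 1]
        out.append(v ** gamma)             # simple gamma as base-curve stand-in
    return out
```

The point of fixing this mini-pipeline is that the mask module sees the same kind of input regardless of what the user's real pipeline does, which sidesteps the per-module training problem raised earlier.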

As a first step, for the AI itself, the rembg approach seems very handy and, very importantly, comparatively fast: even for my 30 MP, 48 bpp TIFF images it typically runs in less than 1 s, though I have no exact timing. Most of that time may be spent reading and writing the TIFF files anyway, which would not be required in the darktable case, since rembg would be used as a library and the data is already in memory.
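For illustration, a minimal sketch of what "rembg as a library" could look like. The rembg call itself is left as a comment (its exact signature, including the `only_mask` option, is an assumption about the library's API), and the helper that turns an 8-bit alpha channel into a float raster mask is hypothetical:

```python
# Sketch: using rembg in-memory instead of round-tripping TIFF files.
# The library call is commented out so this sketch stays self-contained;
# treat the exact API as an assumption:
#
#   from rembg import remove
#   mask_bytes = remove(image_bytes, only_mask=True)  # 8-bit mask

def alpha_to_raster_mask(alpha_bytes):
    """Convert 8-bit alpha samples to the float [0, 1] range a
    darktable-style raster mask would use (hypothetical helper)."""
    return [b / 255.0 for b in alpha_bytes]
```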

At least it would be a starting point, and improved networks and more controls could be added later.

Examples:

The task was to take headshots of my son's soccer team in front of a neutral background. The light was not ideal: harsh afternoon summer sun, with no chance to overpower it with the flashes I own. But it was the only possible date, just before training, so the faces would not yet be red.

This is the best I was able to achieve in darktable two years ago; the GIMP tools for semi-automatic foreground extraction were worse, and even here you can see the yellow cast on the background. This took me at least an hour per photograph, plus the usual editing, and I did not even have the option of background replacement; I had to accept the white. [image: grafik]

With rembg, this year, it took one hour for all 20+ players. I know it is blown out, as the lighting conditions were even worse this time, but this is about the background removal, which is IMO much better. It was a dark background this time, by the way. [image: grafik]

It is not perfect, but given that it took less than 1 s with no parameters at all, and that when I tried again by hand this year I was not able to complete a single image in several hours (which eventually led to my choice to try rembg), having such a raster mask in darktable would be incredibly useful.

Btw, it also worked very well for dark hair in front of the dark background, and also for dark skin, and the background was by no means flat and even. It failed to recognize a hole in only one case, where a little gap between arm and body was not detected properly; in several similar cases it recognized the same kind of hole. With a combined painted mask this would have been very easy to fix, but I only needed the head and shoulders from that image.

spaceChRiS avatar Aug 09 '22 19:08 spaceChRiS

Thanks for your message. I agree that a dedicated module at the beginning of the pipeline, whose raster mask could be used in subsequent modules, could be a good option and probably makes the most sense. What is important, though, is that the user is allowed to modify the AI-created raster mask in that module by combining it with a drawn mask (in case the mask is not perfect). Your experience with AI shows, in my opinion, that this is doable, although maybe not easy. It would be a logical next step in improving masks, so let's hope this gets approved for a future release!

mamvdwel avatar Aug 10 '22 03:08 mamvdwel

I was experimenting with non-AI approaches like segmentation algorithms; they can do some sort of content detection and could be used as a preparation step for a drawn mask ...
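As an illustration of the kind of classical segmentation meant here (my own sketch, not the code being experimented with): a tiny region-growing flood fill that groups neighbouring pixels whose values stay within a tolerance of the seed.

```python
from collections import deque

def flood_segment(img, start, tol):
    """Return the set of (row, col) pixels 4-connected to `start` whose
    value differs from the seed by at most `tol`.
    Illustrative sketch of classical region-growing segmentation."""
    rows, cols = len(img), len(img[0])
    seed = img[start[0]][start[1]]
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                if abs(img[nr][nc] - seed) <= tol:
                    seen.add((nr, nc))
                    queue.append((nr, nc))
    return seen
```

Unlike the AI approach, this only uses local pixel differences, which is exactly why it struggles with the dark-hair-on-dark-background cases mentioned above.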

jenshannoschwalm avatar Aug 10 '22 04:08 jenshannoschwalm

Or as a preparation step for the AI solution … :wink:

Seriously, I am really impressed by what this approach (AI, in particular rembg) can achieve. Of course it fails in some cases as well, but the net time saving you can have with cumbersome tasks is incredible. In particular, the results are sometimes excellent in scenarios where it is hard to believe that a “classical” algorithm would work at all. An example I had is one of the soccer team's members, who has very dark skin and almost black hair, in front of a very dark background, but the AI was able to generate a perfect mask. A classical algorithm based on local features (differences in tone, contrast, color etc.) may have failed miserably.

spaceChRiS avatar Aug 10 '22 06:08 spaceChRiS

the problem of creating a good mask for complex areas

There are already parametric masks and mask combination options to manage complex areas 🤔

Jiyone avatar Aug 12 '22 21:08 Jiyone

This issue did not get any activity in the past 60 days and will be closed in 365 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

github-actions[bot] avatar Oct 12 '22 00:10 github-actions[bot]

I think the beta launch of Photoshop's generative fill has amplified the opportunity for using AI in darktable (even if via user API credentials to a cloud service):

https://www.adobe.com/za/products/photoshop/generative-fill.html

While masking can show skill, this should be secondary to photographic and artistic effort.

ga-it avatar Jun 20 '23 21:06 ga-it

This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

github-actions[bot] avatar Aug 21 '23 00:08 github-actions[bot]

I'd like to point to Segment Anything; it's open source and very impressive (you can try it with your own images on their website).

I imagine a workflow like this:

  1. You do some basic editing, so that brightness and colours are roughly correct.
  2. You press a button labelled "AI Masking".
    1. The image is rendered at medium resolution.
    2. This image is sent to Segment Anything.
    3. The user can now interact with the image, as in the Segment Anything demo, to do the masking.
    4. The user clicks on "Finish AI Masking".
    5. A drawn mask is generated from the Segment Anything data (COCO format) and sent back to darktable.
  3. The user can further refine the drawn mask in darktable.

Such a solution should not be completely out of reach to implement, and it would drastically improve my workflow.
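The hand-off from Segment Anything back to darktable would need a decoder for the mask data. A hedged sketch, assuming the uncompressed COCO RLE variant (where counts alternate background/foreground runs in column-major order); the function name is my own, and real code would also need to handle the compressed string form used by pycocotools:

```python
def decode_coco_rle(counts, height, width):
    """Decode uncompressed COCO RLE counts into a row-major binary mask.
    COCO stores runs in column-major (Fortran) order, starting with a
    background run. Sketch only."""
    flat = []
    value = 0                      # runs start with background (0)
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    assert len(flat) == height * width
    # transpose from column-major storage to row-major rows
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]
```

From such a binary mask, darktable could either build a raster mask directly or trace the outline into a drawn mask, as proposed in step 2.5.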

mfg92 avatar Oct 28 '23 08:10 mfg92

I did give it a try. Yes - impressive!

Although the "send to somewhere and get back a result" workflow doesn't seem good to me: we would depend on that service being provided forever. We would prefer to use a git submodule and run it locally, I think.

jenshannoschwalm avatar Oct 29 '23 08:10 jenshannoschwalm

This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

github-actions[bot] avatar Dec 29 '23 00:12 github-actions[bot]