Advice on creating a simple SegmentAnything2 integration
I love to use darktable for editing my photos, but my main annoyance is that I still need to hand-draw masks for portraits, and the number of masks adds up quickly.
Therefore I thought I'd give it a shot to see if I could use the output of Segment Anything 2 in darktable to save me from manually making all these masks! A screenshot of the UI that I whipped up:
[Screenshot: Segment Anything 2 integration UI for darktable]
My current approach is:
- Select some points in the image and let Segment Anything do its magic
- Convert the points to the format that the path mask uses (I'd change the `darktable:mask_points` attribute):
<rdf:li
darktable:mask_num="11"
darktable:mask_id="1736033909"
darktable:mask_type="2"
darktable:mask_name="path #2"
darktable:mask_version="6"
darktable:mask_points="gz03eJzL+hFttylZwj4Ljea6rmwDwowMDAwqr2PsEgX47NFpZDVHVyTaxYr+sUOnkdUAAAAHJBY="
darktable:mask_nb="3"
darktable:mask_src="0000000000000000"/>
</rdf:Seq>
- Reopen darktable to reload the xmp file with the newly added masks
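For reference, a minimal Python sketch of reading and writing the `darktable:mask_points` value. It rests on assumptions drawn from the darktable source as far as I can tell, not from anything confirmed in this thread: values starting with `gz` are a two-digit size factor followed by base64-encoded zlib data, other values are plain hex, and each path point is eight little-endian floats (corner, ctrl1, ctrl2, border) plus an int state. All function names are hypothetical.

```python
import base64
import struct
import zlib

# Assumed layout of one path point:
# corner[2], ctrl1[2], ctrl2[2], border[2] as floats, then an int state.
POINT = struct.Struct("<8fi")


def decode_blob(text):
    """Turn an XMP attribute value into raw bytes."""
    if text.startswith("gz"):
        # Assumed form: "gz" + two-digit size factor + base64(zlib data)
        return zlib.decompress(base64.b64decode(text[4:]))
    return bytes.fromhex(text)


def encode_blob(raw):
    """Inverse of decode_blob, always writing the compressed form."""
    packed = zlib.compress(raw)
    # Assumption: the reader sizes its output buffer as
    # factor * len(compressed data), so the factor must not be too small.
    factor = min(len(raw) // len(packed) + 1, 99)
    return "gz%02d" % factor + base64.b64encode(packed).decode("ascii")


def unpack_points(raw):
    """Split raw bytes into one 9-tuple per path point."""
    return list(POINT.iter_unpack(raw))


def pack_points(points):
    """Serialize 9-tuples back into raw bytes."""
    return b"".join(POINT.pack(*p) for p in points)
```

A round trip would be `encode_blob(pack_points(unpack_points(decode_blob(value))))`. The `darktable:mask_nb` attribute presumably has to match the number of points that are written back.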
Some problems / thoughts that I currently have:
- I'm currently facing some issues when writing the points back to the file. Reading and editing points (e.g. translating a mask) works fine in my code, but when I replace the mask's path points with the output of the masking code, I get masks with very weird shapes.
- It does not really feel right to first extract the outline from the mask and then use that as a mask. Using the mask as defined on the pixels would be a lot nicer. Is there any way that I can get a custom image into the intermediate masks in the rendering pipeline?
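On the first problem, one possible cause (a guess, not checked against the linked repository) is that a raw pixel contour has far too many vertices and leaves the Bézier handles undefined. The sketch below thins a contour with the Ramer-Douglas-Peucker algorithm, scales it to 0..1 coordinates, and sets both handles equal to the corner, which makes every cubic segment a straight line. The 0..1 coordinate convention, the field order, and the meaning of `state=2` ("handles set by the user") are assumptions; all function names are hypothetical.

```python
import math


def _dist_to_segment(p, a, b):
    """Distance from point p to the line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker on an open polyline; keeps both end points."""
    if len(points) < 3:
        return list(points)
    worst, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _dist_to_segment(points[i], points[0], points[-1])
        if d > worst:
            worst, index = d, i
    if worst <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


def contour_to_path_points(contour, width, height,
                           tolerance=2.0, border=0.0, state=2):
    """Convert a closed pixel contour into straight-edged path points."""
    # Append the start point so the closing edge is simplified too,
    # then drop the duplicate again.
    ring = list(contour) + [contour[0]]
    kept = simplify(ring, tolerance)[:-1]
    points = []
    for x, y in kept:
        nx, ny = x / width, y / height
        # corner == ctrl1 == ctrl2 gives straight segments between corners
        points.append((nx, ny, nx, ny, nx, ny, border, border, state))
    return points
```

The recursion is fine for a sketch, but very long contours may need an iterative version. If the shapes are still wrong with straight segments, the coordinate convention is the next thing to check.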
Some input from darktable devs, who know a lot better than me how this all works, would be really appreciated! I've also been looking into converting this into a darktable plugin once it works reliably, but I've had a lot of trouble finding good resources...
I've also read through the Lua API docs but didn't find any functionality there to alter masks. Maybe there is a way to accomplish this that I didn't find in the docs?
Here is the code: https://github.com/kalmjasper/segmentanything_darktable
- is there any way that I can get a custom image into the intermediate masks in the rendering pipeline?
Well, no.
Unless you like really dirty hacks...
You can use the composite module to load the bitmap output of SAM2 into the image pipeline: first import the mask as a separate image, then use that image as the source for composite.
Then you can create a raster mask on the output of composite by selecting a parametric mask and enabling "show output channels" in the blending hamburger menu. With the first output four-slider selector (gray value), select only the portions of the mask image that are near black. (If you are interested in the bright areas of the mask image, you can probably use negadoctor on the imported mask to invert it.) That mask will also be used to blend the composite itself, i.e. the mask image would get blended into your original image, which you don't want. It doesn't matter if you add more black, though, so select the "addition" blend mode: if your mask selects only black, adding that to your original image has no effect.
Toggle "display mask" to see if you got the correct raster mask.
Now in the module where you want to use the mask, select "raster mask" and as source of your mask select "composite".
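To try this, the SAM2 mask first has to exist as an image file that darktable can import. A minimal sketch, assuming the mask is available as rows of 0/1 values and that an 8-bit grayscale PNG is acceptable as the composite source. By default it writes the selected area as black, matching the "near black" selection described above. `mask_to_png_bytes` and `write_mask_png` are hypothetical helpers that use only the Python standard library.

```python
import struct
import zlib


def _chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def mask_to_png_bytes(mask, selected_is_black=True):
    """Encode rows of 0/1 values as an 8-bit grayscale PNG."""
    height, width = len(mask), len(mask[0])
    on, off = (0, 255) if selected_is_black else (255, 0)
    raw = bytearray()
    for row in mask:
        raw.append(0)  # filter type 0 (none) for this scanline
        raw.extend(on if value else off for value in row)
    # width, height, bit depth 8, colour type 0 (grayscale),
    # compression 0, filter 0, no interlace
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", header)
        + _chunk(b"IDAT", zlib.compress(bytes(raw)))
        + _chunk(b"IEND", b"")
    )


def write_mask_png(path, mask, selected_is_black=True):
    """Write the mask to disk so it can be imported as a separate image."""
    with open(path, "wb") as handle:
        handle.write(mask_to_png_bytes(mask, selected_is_black))
```

Writing the subject as black up front should make the negadoctor inversion step unnecessary, though that is untested here.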
What if....
- You have an image you want to mask, so you draw a crude path mask.
- I access the mask and get the points.
- I have an exporter that exports the image to JPG, plus a points file for the SAM2 engine.
- The SAM2 engine generates a good mask from the crude set of points and the JPG, and outputs the resulting points to a file.
- I import the file and replace the crude mask points with the SAM2 points.
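The export step in this workflow could turn the crude path into a SAM2 prompt roughly as sketched below. This assumes the path corners are available as normalized (x, y) pairs and that the engine accepts a bounding box plus one positive point. The key names mirror the arguments of SAM2's image predictor as I understand them, and `polygon_prompt` is a hypothetical helper.

```python
def polygon_prompt(points, width, height):
    """Turn normalized path corners into a pixel box and one positive point."""
    pixels = [(x * width, y * height) for x, y in points]
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    box = (min(xs), min(ys), max(xs), max(ys))

    # Area-weighted centroid via the shoelace formula.
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(pixels, pixels[1:] + pixels[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0:
        # Degenerate polygon: fall back to the mean of the corners.
        centre = (sum(xs) / len(xs), sum(ys) / len(ys))
    else:
        centre = (cx / (3 * area2), cy / (3 * area2))

    return {"box": box, "point_coords": [centre], "point_labels": [1]}
```

For a strongly concave path the centroid can fall outside the shape, in which case the box alone is probably the safer prompt. The resulting dictionary can be written to the points file with `json.dump`.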
I guess this fits the definition of dirty hack :-)
EDIT: What if...
- we had a module, sam2 whose only purpose was to have/hold the mask.
- you put it before the modules that need the mask so they can access it as a raster mask
- you could have multiple instances at different places in the pipeline.
- since the mask would truly be a mask, couldn't you just reuse the shape?
- we had a module, sam2 whose only purpose was to have/hold the mask.
Or a more generic module that hands over the processed image at that stage of the pipeline to a customizable app (e.g. Python scripts to play around with arbitrary AI applications) and receives a raster mask (maybe also a processed image) in return. Since an increasing number of AI models can run locally, this wouldn't restrict the use cases.
Very important to avoid performance issues: the mask shouldn't be updated automatically on changes in the pixelpipe, only on an explicit command by the user.
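The hand-over contract for such a generic module could be as small as "run a command with an input image path and an output mask path". A sketch of the calling side, written in Python for illustration only (a real darktable module would be C); `run_mask_tool` and the two-argument convention are hypothetical. It runs exactly once per call, which fits the "only on explicit command" requirement.

```python
import subprocess
from pathlib import Path


def run_mask_tool(command, image_path, mask_path, timeout=300):
    """Run an external tool once and return the path of the mask it wrote."""
    result = subprocess.run(
        list(command) + [str(image_path), str(mask_path)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "mask tool failed")
    if not Path(mask_path).is_file():
        raise RuntimeError("mask tool finished without writing a mask")
    return Path(mask_path)
```

A call like `run_mask_tool(["python", "sam2_mask.py"], "export.jpg", "mask.png")` would then be triggered from a button rather than from pipeline updates; the script name here is made up.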
@dterrahe Honestly not a completely terrible idea haha. I can't see a way in the lua docs to script modules though... Will try to get it going over the weekend.
@wpferguson @MStraeten Having something like a module that can handle this type of input / output and is nicely scriptable would be ideal. I couldn't really find a way to make a custom darktable plugin that works on this level, would be happy to give it a shot otherwise. How would you approach this?
I also posted the same question on the darktable subreddit and people generally really like the idea. I think there'd be quite some interest if there is a not-too-hacky way to integrate this.
The boilerplate code for a module is src/iop/useless.c. Since all we are interested in is a mask, it might not need many changes.
Years ago AP created a module to call Krita from darkroom, pass an image, make changes in Krita, and return the changed data. It would be nice if someone had that code lying around.
Yeah that'd be amazing... How do people generally go about custom modules? Is it easy to have a github repo up where people can use the custom module? I haven't been able to find examples of this so far...
You can fork the darktable repository and then play around. If it's good enough for a field test, you can open a pull request.
Duplicate of https://github.com/darktable-org/darktable/issues/12295
This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.
Partially addressed by #18753.
Still interested in more integrated AI masks.