flameshot Initial implementation of the plugin framework

[x] Plugin RFC #2529
[x] Printer support and save to PDF support

Jul 05 '24 09:07 jack9603301

This framework will introduce a new dependency: yaml-cpp

Jul 05 '24 10:07 jack9603301

@panpuchkov Can you take a look at this PR?

Sep 25 '24 18:09 jack9603301

If we go with a plugin system, why do we need to have multiple #ifdef USE_PRINTER_SUPPORT, it looks weird. If we don't want to have it as a built-in option and want to have a build option to include it or not, this is the case where it should be as a plugin.

Oct 02 '24 20:10 panpuchkov

If we go with a plugin system, why do we need to have multiple #ifdef USE_PRINTER_SUPPORT, it looks weird. If we don't want to have it as a built-in option and want to have a build option to include it or not, this is the case where it should be as a plugin.

In fact, this release has two features, one is plug-in support, and the other is a small patch that supports flameshot to send images directly to the printer.

You will find that these two are independent, and you are free to start either one.

Oct 10 '24 17:10 jack9603301

I feel like a reason this hasn't moved is because the PR is doing 2 things. It would probably be better to implement them one at a time.

Jan 13 '25 06:01 watzon

I agree that this needs to split the plugin from the printer support. So we can talk about both of those separately. I also can't understand how this implementation actually implements plugins. Where is the spec for what goes in the yaml file? How does a user implement a plugin?

May 13 '25 12:05 borgmanJeremy

@jack9603301 would you please separate these two parts (i.e., plugin system, and printer support) and keep in this PR the plugin system part, and put the printer in another PR so that we get the ball rolling.

May 13 '25 12:05 mmahmoudian

okay, i will

May 13 '25 12:05 jack9603301

I agree that this needs to split the plugin from the printer support. So we can talk about both of those separately. I also can't understand how this implementation actually implements plugins. Where is the spec for what goes in the yaml file? How does a user implement a plugin?

https://github.com/flameshot-org/plugins/blob/fa6ed42b829a1e01e4a661501c33caa272a3d282/Watermark/plugin.yaml.example

Here is the first plug-in implemented - the yaml configuration and code of the watermark plug-in

May 13 '25 13:05 jack9603301

Thanks for clarifying. I looked through that plugin and this code again and had the following thoughts:

1.) The general idea behind this seems solid. IE build a library per plugin, each plugin gets a yaml file to configure its own properties. I like all that.

2.) The plugin directory should be configurable at run time through the UI so the user can tell the application where to load plugins from.

3.) I think the way this works now is each plugin is essential part of the "save pipeline" so each effect from each plugin will be added the image before saving. I would rather have a "plugin button" that pops up all loaded plugins, and then the user can select which plugin to apply.

3a.) Those plugins can either be "final actions" like saveWithWatermark in which case they will apply whatever formatting to the image and then terminate. another example would be save to PDF or print, both of those can be final action plugins

3b.) Or they can be "image actions" like if someone wants to implement a custom arrow. In that case the plugin should return a layer with the custom arrow or whatever.

May 13 '25 22:05 borgmanJeremy

Thanks for clarifying. I looked through that plugin and this code again and had the following thoughts:

1.) The general idea behind this seems solid. IE build a library per plugin, each plugin gets a yaml file to configure its own properties. I like all that.

2.) The plugin directory should be configurable at run time through the UI so the user can tell the application where to load plugins from.

3.) I think the way this works now is each plugin is essential part of the "save pipeline" so each effect from each plugin will be added the image before saving. I would rather have a "plugin button" that pops up all loaded plugins, and then the user can select which plugin to apply.

3a.) Those plugins can either be "final actions" like saveWithWatermark in which case they will apply whatever formatting to the image and then terminate. another example would be save to PDF or print, both of those can be final action plugins

3b.) Or they can be "image actions" like if someone wants to implement a custom arrow. In that case the plugin should return a layer with the custom arrow or whatever.

Good idea, this is an initial implementation of an immature plugin system, are you willing to work with us to improve it?

A complete plugin system may require a lot of APIs and integrate with flameshot in a perfect way.

May 14 '25 17:05 jack9603301

@jack9603301

A complete plugin system may require a lot of APIs and integrate with flameshot in a perfect way.

I think we can start from some parts and expose ports to plugins eventually.

I think what @borgmanJeremy suggested is a very nice separation between the two types of plugins, and we can achieve a lot by:

accepting layers from plugins and displaying them (Jeremy's 3b)
Sending the screenshot to plugins as final actions (3a)

For now i believe the Print and PDF export are doing the 3a. So that part is almost done. When that is merged, we can move to implement 3b.

May 14 '25 17:05 mmahmoudian

@jack9603301

A complete plugin system may require a lot of APIs and integrate with flameshot in a perfect way.

I think we can start from some parts and expose ports to plugins eventually.

I think what @borgmanJeremy suggested is a very nice separation between the two types of plugins, and we can achieve a lot by:

accepting layers from plugins and displaying them (Jeremy's 3b)

Sending the screenshot to plugins as final actions (3a)

For now i believe the Print and PDF export are doing the 3a. So that part is almost done. When that is merged, we can move to implement 3b.

Yes, the key is that 3b also has its second item, which requires modifying the core code to execute the plug-in logic more deeply. We may also need to add some general plug-in logic and interfaces so that the plug-in manager can distinguish the types of different plug-ins, and then intelligently execute various logics based on specific types and interface definitions - because plug-ins will become more and more complex. This is essentially another implementation of Windows COM component technology. The role of the plug-in manager means that its logic will become more and more complex.

My ability is limited, and I can't spend all my time here, so I need the help of developers and other senior developers.

May 14 '25 17:05 jack9603301

Yeah I can help. Give me a few days to draw it out in more detail.

May 14 '25 23:05 borgmanJeremy

Yeah I can help. Give me a few days to draw it out in more detail.

sure, thanks you!

May 15 '25 10:05 jack9603301

Sending the screenshot to plugins as final actions (3a)

Does this still include a message possibly displayed by flameshot? I am just trying to understand the plugin system from the standpoint of a custom uploader.

If a screenshot is uploaded, the URL still needs to be displayed somehow. While a copy URL button would be nice in such a "result window", being able to select and copy the URL is also fine. Otherwise you end up with uploaded images, but you don't know how to retrieve or share them. ;-) Other plugins also might need flameshot to display a message, e.g. success, error, some other relevant info...

I'm also not entirely clear how the plugins themselves are written. In the watermark example above, the plugin is a shared object. How is the image "sent" to the plugin? Via stdin or via a tempfile (e.g. the plugin receives a path/file)? While using shared objects is certainly an option, this also means plugins are limited to a specific OS, unless the plugin dev distributes multiple binaries.

Sorry for my questions. Just trying to wrap my head around this.

May 15 '25 22:05 tessus

Sending the screenshot to plugins as final actions (3a)

Does this still include a message possibly displayed by flameshot? I am just trying to understand the plugin system from the standpoint of a custom uploader.

If a screenshot is uploaded, the URL still needs to be displayed somehow. While a copy URL button would be nice in such a "result window", being able to select and copy the URL is also fine. Otherwise you end up with uploaded images, but you don't know how to retrieve or share them. ;-) Other plugins also might need flameshot to display a message, e.g. success, error, some other relevant info...

I'm also not entirely clear how the plugins themselves are written. In the watermark example above, the plugin is a shared object. How is the image "sent" to the plugin? Via stdin or via a tempfile (e.g. the plugin receives a path/file)? While using shared objects is certainly an option, this also means plugins are limited to a specific OS, unless the plugin dev distributes multiple binaries.

Sorry for my questions. Just trying to wrap my head around this.

It is actually passed through the function pipeline

May 16 '25 06:05 jack9603301

It is actually passed through the function pipeline

I am sorry, I have no idea what that means. I have read the README.md but that's about it. I don't know about any internals of flameshot.

May 16 '25 20:05 tessus

It is actually passed through the function pipeline

I am sorry, I have no idea what that means. I have read the README.md but that's about it. I don't know about any internals of flameshot.

The current working principle of the plugin manager is actually a pipeline. When flameshot saves an image, it will execute a trigger based on the callback function mechanism, and this trigger function of the plugin manager will access all plugin interfaces in turn and process them until the instructions cannot be processed further, and then the final image will become the final effect saved. During the whole process, the image will be passed as a function parameter in the stack memory.

May 17 '25 11:05 jack9603301

@jack9603301 thanks for the info. Unfortunately that means that all installed plugins are applied. Now the question is in which order. eg: the watermark plugin.. would it be applied before or after a custom upload plugin (or a few other plugins)? The workflow seems rather unintuitive and even dangerous. Shouldn't a user be able to decide which plugin to apply? If not, this fact should be stated very clearly in the docs. But this basically means that users can (in many situations) install and use only one plugin at a time.

I'm not saying that a pipeline is not useful, I'm just saying it should be a choice.

So, it seems the plugin has to be written in C or C++. Not an issue for me, but as mentioned before this is not very OS agnostic. This makes the ecosystem less unified and I also suspect that a few people might be scared off by the C/C++ factor.

If the image is passed as a function parameter, does this mean a binary representation or in an encoded form (base64, uuencoded), also in which image format? Is there a documentation for how to write a plugin?

P.S.: As an additional question: so a popup window to show the URL would have to be handled by the custom uploader plugin, is that correct? This would mean that all plugins are run synchronously and sequentially.

May 17 '25 22:05 tessus

@tessus maybe this is the right time for me to join in and attract your attention to this RFC in case you have not seen it already:

https://github.com/flameshot-org/flameshot/discussions/2529

Some part of the points you mentioned are covered in this RFC discussion (I'm not implying that this PR is exactly implementing that RFC), but I also now see some points that you raised that are not in that RFC discussion either.

@borgmanJeremy @jack9603301 @tessus

For the lack of better terminology, let's break down plugins into two categories:

Augmentations: those plugins that add a layer to the layer stack during the annotation while in flameshot gui (e.g., adding shapes and arrows, adding stickers
Final Actions: those which apply certain modifications or do some action to the final cropped image (e.g., adding border, watermark, printing, upload, OCR, barcode reader)

The Augmentations are simple because they add a layer and the user has full control over their execution order. The Final Actions are the issue. They take the raster image and do something. Let's assume we have watermark and upload. Logically we want to apply the watermark first and then upload. This is where I'm not clear how we should implement it from UX perspective. Should we display a graphical window to user and allow them to build a flowchart (or order of execution)? This can become complicated, overwhelming, and time consuming if the user want to do it every single time. Should we mandate the plugins to declare if they are literally the end-point action (e.g., printing) or middle-point (e.g., watermark), and then:

if user clicks on end-point action, that gets executed right away.
if user click on middle-point, we allow the user to click as many middle-points they want in any order they want, and then wait for an end-point to be clicked, then execute them in the clicked order?

If so, we can leverage the side-panel area (e.g., new tab there) and show the user the sequence of execution. This way they can change the order and delete unnecessary steps and then click "Go"?

Thankfully we have the side-panel area that can be easily used for these things.

I wonder how other screenshot tools like ShareX are handling this.

May 17 '25 23:05 mmahmoudian

attract your attention to this RFC in case you have not seen it already:

Ah, nice. Thanks. Not sure how I could have missed this.

Should we mandate the plugins to declare if they are literally the end-point action (e.g., printing) or middle-point (e.g., watermark)

This might be a problem, because certain plugins could be both. But your other idea to the rescue:

if user clicks on end-point action, that gets executed right away.

if user click on middle-point, we allow the user to click as many middle-points they want in any order they want, and then wait for an end-point to be clicked, then execute them in the clicked order?

Maybe we could agree on a convention: e.g. if a user clicks on an icon/plugin it is considered an endpoint action and is executed right away.

Using the side-panel is a great idea. If a user has multiple plugins that should be run, some interaction has to be initiated anyway: the user has to decide the order and which plugins to use. In such a case the user could just add x plugins to a "plugin pipeline" tab and these plugins are execututed in order, which means the last one in the list is the end-point.

Does this make sense?

May 17 '25 23:05 tessus

@jack9603301 thanks for the info. Unfortunately that means that all installed plugins are applied. Now the question is in which order. eg: the watermark plugin.. would it be applied before or after a custom upload plugin (or a few other plugins)? The workflow seems rather unintuitive and even dangerous. Shouldn't a user be able to decide which plugin to apply? If not, this fact should be stated very clearly in the docs. But this basically means that users can (in many situations) install and use only one plugin at a time.

I'm not saying that a pipeline is not useful, I'm just saying it should be a choice.

So, it seems the plugin has to be written in C or C++. Not an issue for me, but as mentioned before this is not very OS agnostic. This makes the ecosystem less unified and I also suspect that a few people might be scared off by the C/C++ factor.

If the image is passed as a function parameter, does this mean a binary representation or in an encoded form (base64, uuencoded), also in which image format? Is there a documentation for how to write a plugin?

P.S.: As an additional question: so a popup window to show the URL would have to be handled by the custom uploader plugin, is that correct? This would mean that all plugins are run synchronously and sequentially.

Of course, what you said makes sense, but this is just a preliminary implementation, and there is still a lot of work to be done to fully implement the plugin framework.

In addition, from the perspective of the plugin framework, it supports multiple types of plugins, but currently only one interface of the qt plugin type is implemented.

So, if you have better suggestions, you are welcome to

May 18 '25 06:05 jack9603301

Should we mandate the plugins to declare if they are literally the end-point action (e.g., printing) or middle-point (e.g., watermark), and then:

Maybe we can add priorities to plugin declarations? Too much configuration in yaml code

If so, we can leverage the side-panel area (e.g., new tab there) and show the user the sequence of execution. This way they can change the order and delete unnecessary steps and then click "Go"?

This seems like good advice.

May 18 '25 06:05 jack9603301

Using the side-panel is a great idea. If a user has multiple plugins that should be run, some interaction has to be initiated anyway: the user has to decide the order and which plugins to use. In such a case the user could just add x plugins to a "plugin pipeline" tab and these plugins are execututed in order, which means the last one in the list is the end-point.

Does this make sense?

I like this design, but it may need a more complex implementation, and perhaps more developers to get involved and implement it. The plugin system is probably the most complex implementation flameshot has ever had, because its full implementation means deep integration, and as I said, it is essentially implementing Windows COM technology in another way.

May 18 '25 06:05 jack9603301

@mmahmoudian Should we set up a separate branch and separate group of volunteer developers to implement and discuss it?

May 18 '25 07:05 jack9603301

@jack9603301 I don't know. I believe your PR is a great starting point and needs some modifications. As far as I understand, the core functionality is there and we need to make the UI/UX user friendly. I also would rather have you and your opinion on the plugin system too as you are the first person who's PR is the closest to a plugin system implementation and you have been working with Flameshot codebase for many months now.

@borgmanJeremy what's your opinion?

@tessus

Maybe we could agree on a convention: e.g. if a user clicks on an icon/plugin it is considered an endpoint action and is executed right away.

I assume you are referring to endpoint plugins?

If a user has multiple plugins that should be run, some interaction has to be initiated anyway: the user has to decide the order and which plugins to use. In such a case the user could just add x plugins to a "plugin pipeline" tab and these plugins are execututed in order, which means the last one in the list is the end-point.

This makes sense for example if the user wants to print && upload && save && pin ... that the last endpoint action is the terminal one.

Also, should we disable the endpoint plugins if the user has already specified a final action in CLI? For example flameshot gui --clipboard.

That said, perhaps in the not so distant future users will want to use the endpoint plugins from CLI as well. If that's the case, then we should:

Mandate a plugin's unique name in their yaml file (and we should somehow handle name collision, perhaps at the time of installing a plugin)
Allow something like flameshot gui --endpoint-plugins=watermark,upload,print --clipboard.

But I think CLI is related but peripheral to this thread. We can focus on this for now.

May 18 '25 07:05 mmahmoudian

@jack9603301 I don't know. I believe your PR is a great starting point and needs some modifications. As far as I understand, the core functionality is there and we need to make the UI/UX user friendly. I also would rather have you and your opinion on the plugin system too as you are the first person who's PR is the closest to a plugin system implementation and you have been working with Flameshot codebase for many months now.

Yes, it is a starting point, but there is still a lot of implementation to be done

May 18 '25 07:05 jack9603301

@jack9603301

Yes, it is a starting point, but there is still a lot of implementation to be done

Yes, but as you said, this is one of the largest additions to this project, so we can release it as experimental and have it disabled by default. This way we can gradually implement different parts.

I think what we need now is a clear plan. The RFC needs to be revised to reflect the discussions here and in the original RFC, in order to show us the path. Also, let's wait for @borgmanJeremy opinions and if he has sketched something.

I truly like to see this plugin system because it will give us more opportunity to focus on screenshot rather than imgur issues. It would be super extra great if we can support Python and Lua to lower the barrier to entry for users.

May 18 '25 07:05 mmahmoudian

Yes, but as you said, this is one of the largest additions to this project, so we can release it as experimental and have it disabled by default. This way we can gradually implement different parts.

Yes, I agree

I think what we need now is a clear plan. The RFC needs to be revised to reflect the discussions here and in the original RFC, in order to show us the path. Also, let's wait for @borgmanJeremy opinions and if he has sketched something.

Yes, let's wait.

May 18 '25 08:05 jack9603301