VoiceInk icon indicating copy to clipboard operation
VoiceInk copied to clipboard

Add workflow manager

Open maelp opened this issue 1 year ago • 20 comments

Add a simple workflow manager (see https://github.com/Beingpax/VoiceInk/issues/17)

maelp avatar Mar 18 '25 11:03 maelp

It works :)

CleanShot 2025-03-18 at 13 08 41@2x

Basically:

  • we add a configuration pane where people can add their workflows
    • a title (unused, but just for clarity)
    • a prompt (which will be given to the LLM)
    • an expected JSON output format (it will become WORKFLOW_ARGS in the shell script)
    • a bash script to run

then this will run the bash script with the following ENV variables set

WORKFLOW_ARGS contains the output JSON

and for convenience, for each key "myKey" of WORKFLOW_ARGS, a WORKFLOW_ARG_MYKEY env with the corresponding JSON value

maelp avatar Mar 18 '25 12:03 maelp

CleanShot 2025-03-18 at 13 12 06@2x

maelp avatar Mar 18 '25 12:03 maelp

Note: the script must be set to executable chmod u+x script.sh

maelp avatar Mar 18 '25 12:03 maelp

https://github.com/user-attachments/assets/b417b9f5-099b-4172-b151-d0a961eaed2a

maelp avatar Mar 18 '25 12:03 maelp

@Beingpax there's even a responseFormat field (at least for ChatGPT), this could be useful to force the result to really be JSON, and avoid the LLM making mistakes (but I don't know if there are the equivalent for other LLMs... at least it could make it better for people using ChatGPT) https://github.com/MacPaw/OpenAI?tab=readme-ov-file

CleanShot 2025-03-18 at 19 11 35@2x

CleanShot 2025-03-18 at 19 11 46@2x

maelp avatar Mar 18 '25 18:03 maelp

https://platform.openai.com/docs/guides/structured-outputs?api-mode=chat

maelp avatar Mar 18 '25 18:03 maelp

It's also available for Ollama https://ollama.com/blog/structured-outputs and Anthropic https://docs.anthropic.com/en/docs/test-and-evaluate/strengthen-guardrails/increase-consistency

maelp avatar Mar 18 '25 18:03 maelp

You did some crazy work here @maelp .

I'm adding some comments.

Beingpax avatar Mar 19 '25 08:03 Beingpax

Thank you!

It might actually need some cleaning / correction (in particular the JSON parsing and validation and error handling) because I (mostly) did it with Claude which isn’t perfect, it was meant as a POC to show you how your VoiceInk could become a full-fledged AI assistant quite easily!

Thanks for your great app!

maelp avatar Mar 19 '25 09:03 maelp

Hi @maelp , this is my first time collaborating on Github open-source project. And also a fairly new developer.

Will you make the changes mentioned in the comments or should I make them myself? I'm a little confused. Sorry for that. 😁

Beingpax avatar Mar 19 '25 14:03 Beingpax

@Beingpax we can collaborate on it! But since you're the main maintainer, I'm not exactly sure what's your preferred approach to coding / what kind of libs you want to use / how do you want to integrate stuff

I think you could review and tell me what things you like / don't like in the code, and then depending on whether I know how to do them myself I'll work on it, and if I have a question / need your help I will ask

maelp avatar Mar 19 '25 14:03 maelp

also I added a IS_DEVELOPMENT compiler flag to compile the "license" feature only in Release so I can dev without nagging... but perhaps you have a better way to do this? Not sure how you do development on your laptop yourself?

maelp avatar Mar 19 '25 14:03 maelp

@Beingpax we can collaborate on it! But since you're the main maintainer, I'm not exactly sure what's your preferred approach to coding / what kind of libs you want to use / how do you want to integrate stuff

I think you could review and tell me what things you like / don't like in the code, and then depending on whether I know how to do them myself I'll work on it, and if I have a question / need your help I will ask

Suree. That would be awesome. I think talking on Discord would be good?

Beingpax avatar Mar 19 '25 14:03 Beingpax

Perfect, add me there, should be mael_oulipo

Message ID: @.***>

maelp avatar Mar 19 '25 14:03 maelp

also I added a IS_DEVELOPMENT compiler flag to compile the "license" feature only in Release so I can dev without nagging... but perhaps you have a better way to do this? Not sure how you do development on your laptop yourself?

I have removed the keys for License verification on the open-source code. There are very few restrictions on the app. Even if you do not pay the app, even after 7 days, it will just work as it is. Except it will add that little text at the beginning of all of your transcript results. Please purchase VoiceInk Pro. For me, it just works as it is, because the app inherits the settings from the installed version.

Maybe for you would be a different case because of using a different bundle identifier for the Xcode build.

Beingpax avatar Mar 19 '25 15:03 Beingpax

@Beingpax , @maelp I was just about to suggest something similar — great to see you’re already on it! One small idea: instead of only supporting shell scripts, it could be even more powerful to allow for different script types or tool integrations. Some tasks go beyond what Bash alone can handle, so that flexibility would really add value. And on a side note — it’s pretty amazing to be leaving this comment using Voice Ink! If there’s any way I can contribute, I’d be happy to help.

Mgajurel avatar Jun 13 '25 07:06 Mgajurel

I haven't had the time to really complete this task, so now the code has drifted enough that I guess it would make more sense to scrap this PR and do a fresh implementation @Beingpax ?

I'm not sure when I will have time to do this

@Mgajurel using bash script, you can then start any other script you'd like, so it's not a big deal, eg your script could just do

# myscript.sh
python mycode.py $@ # run a python script, giving it all the arguments, etc

maelp avatar Jun 13 '25 08:06 maelp

I haven't had the time to really complete this task, so now the code has drifted enough that I guess it would make more sense to scrap this PR and do a fresh implementation @Beingpax ?

I'm not sure when I will have time to do this

@Mgajurel using bash script, you can then start any other script you'd like, so it's not a big deal, eg your script could just do

# myscript.sh
python mycode.py $@ # run a python script, giving it all the arguments, etc

Got it — that makes sense. @maelp

What I’m really aiming to highlight is the potential of tools like MCP, which go beyond just script execution. They allow direct integrations with platforms like Jira, Postman, or even Slack to automate specific workflows.

For example, instead of calling a shell script to run a Python file, we could use MCP to trigger something like “create a new invite in Slack” or “send a request through Postman” — making these tasks more native and integrated. That’s the kind of flexibility I think could really add value here.

Mgajurel avatar Jun 13 '25 08:06 Mgajurel

Sure, that’s also doable through bash script (basically have a MCP server running on your laptop, and just have a bash script piping your query directly to the script)

maelp avatar Jun 13 '25 08:06 maelp

I'm not currently working on this, but I'd be happy if anyone would like to contribute! @Mgajurel @maelp

Beingpax avatar Jun 13 '25 16:06 Beingpax