
Dictation for Assistant (inline and chat-based)

mknw opened this issue 1 year ago

Check for existing issues

  • [X] Completed

Describe the feature

I've been using the inline assistant tool more and more, for both longer and shorter tasks. It has a lot of advantages.

I would like to be able to dictate to the assistant while seeing the transcribed text, and to approve or edit the input before submitting the instructions to the LLM.

This could be visualised in the inline assistant element (⌃ctrl+↩return) as a microphone icon next to the "configure" icon on the left. It would shorten the time needed to give directions to the assistant, while still letting one adjust the input to correct variable names and similar mistakes.

Additionally, I would suggest adding keyboard shortcuts to:

  1. start inline assistant dictation (launching the inline assistant tool as it works right now, but with the microphone already turned on); and
  2. approve the input (in case the dictation is correct) without having to use the trackpad/mouse. For this, the regular Enter key could get the job done, though an additional modifier (e.g. Shift+Enter) could help ensure the dictation is not submitted by mistake. A rough keymap sketch follows below.
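
For illustration only, here is what such bindings might look like in Zed's keymap.json. The `assistant::InlineAssist` action exists today (the ⌃ctrl+↩return binding mentioned above); the other action names and the `InlineAssist` context are hypothetical placeholders for the proposed behaviour:

```json
[
  {
    "context": "Editor",
    "bindings": {
      // Existing action: open the inline assistant.
      "ctrl-enter": "assistant::InlineAssist",
      // Hypothetical action: open the inline assistant with the mic already on.
      "ctrl-shift-enter": "assistant::StartDictation"
    }
  },
  {
    // Hypothetical context for the open inline-assist input.
    "context": "InlineAssist",
    "bindings": {
      // Hypothetical action: submit the reviewed transcription to the model.
      "shift-enter": "assistant::ConfirmDictation"
    }
  }
]
```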

Nice to have

It would be nice if the dictation itself were context-aware (instead of the context only being submitted along with the prompt). This would make it easier for the voice-to-text model to pick up variable names and other code-specific keywords.

mknw avatar Aug 17 '24 15:08 mknw

Very interesting use!

If you are running on macOS you can do some of this today with the built-in speech-to-text. Make sure dictation is enabled under System Preferences -> Keyboard -> Dictation (toggle it on) and set a dictation keyboard shortcut if desired (it defaults to Fn+F5).

Then just trigger the dictation shortcut (Fn+F5) and talk into the assistant. You can freely edit the text before submitting it to the model.

A more integrated approach would likely leverage one or more of the following:

notpeter avatar Aug 23 '24 17:08 notpeter

I would love to see this too. I use Advanced Voice Mode so often on my phone to explore complex problem spaces, and I'd love to have a voice-first workflow for working with an LLM in the chat window as well.

I've actually been using the macOS built-in dictation for a while, but it pales in comparison to OpenAI's models: it's too error-prone and flaky, and it breaks the flow.

I wonder if Zed could instead incorporate various voice-to-text models through a model picker like the existing one... and even allow local models from ollama etc.
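
For a concrete sense of the remote side, OpenAI already exposes a transcription endpoint that takes an audio file and returns text. A minimal sketch, assuming a recorded prompt.wav and OPENAI_API_KEY set in the environment:

```sh
# Upload a recorded clip to OpenAI's transcription endpoint and print the text.
# whisper-1 is the hosted Whisper model.
curl -s https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@prompt.wav \
  -F model=whisper-1
```

Local engines such as whisper.cpp ship a small HTTP server with a comparable file-upload interface, so a model picker could plausibly route to either.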

lougreenwood avatar Dec 11 '24 14:12 lougreenwood

I enjoy Voice Support in VSCode and wish Zed had something similar.

Of course, it's possible to use external dictation software like SuperWhisper.

wcygan avatar Jun 05 '25 07:06 wcygan