
[Feature Request]: Take dictation

Open uhyf3 opened this issue 3 years ago • 17 comments

It would be good if Dicio could take dictation, composing and modifying text.

uhyf3 avatar Jan 03 '22 12:01 uhyf3

If you add this feature, consider the New Note intent https://developer.android.com/guide/components/intents-common#NewNote

Edit: I tried dispatching this action myself. It turns out that most notes apps ignore these intents, including Google Keep, Markor, and Simple Notes (from F-Droid). I will file an issue with Simple Notes; I presume Google Keep wouldn't care about such things.

https://github.com/SimpleMobileTools/Simple-Notes/issues/492

hobbycommandline avatar Jan 04 '22 21:01 hobbycommandline

On further research, you can send plain text files to apps via the "open in" mechanism, and many more apps support that. So even though it's not the 'proper' way to do things, that would probably be the recommended way.

hobbycommandline avatar Jan 05 '22 03:01 hobbycommandline

If Simple Notes and Markor won't work, would Jota Text Editor work anyway? It has more features than Simple Notes, is simpler than Markor, is easier to use, and appears to be more stable than either. Google Keep defeats the purpose of being 100% offline anyway. Jota Text Editor is quite lightweight, so incorporating it into dicio-android should not increase the footprint much. If the "storage" permission is a problem, it could be taken out, and its "Copy," "Cut," and "Paste" features could carry the data and, along with the insertion point and selection, allow composition.

uhyf3 avatar Jan 05 '22 19:01 uhyf3

I have figured out how to do it in a way more apps recognize. You have to use an Intent with ACTION_SEND, call intent.setType("text/plain"), and set EXTRA_TEXT to the message you want to send. This allows you to send it not only to apps like Google Keep, Simple Notes, and Markor, but also as a Tweet or a Discord message. Unfortunately, it does require you to use startActivity, which means the user will be forced to touch their phone and cannot complete the action solely through voice.

https://developer.android.com/reference/android/content/Intent#EXTRA_TEXT https://developer.android.com/reference/android/content/Intent#ACTION_SEND
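
The ACTION_SEND approach described above can be sketched roughly as follows. This is a minimal Java illustration, not code from Dicio or from any project linked in this thread; the class `NoteSharer` and the method `shareText` are hypothetical names. Wrapping the intent with `Intent.createChooser` and adding `FLAG_ACTIVITY_NEW_TASK` are assumptions added for the sketch (the flag is needed when the `Context` is not an `Activity`).

```java
import android.content.Context;
import android.content.Intent;

/** Hypothetical helper (not part of Dicio): hands dictated text to any app that accepts plain text. */
final class NoteSharer {
    private NoteSharer() {}

    static void shareText(Context context, String text) {
        // ACTION_SEND with a text/plain type and the message in EXTRA_TEXT.
        Intent send = new Intent(Intent.ACTION_SEND);
        send.setType("text/plain");
        send.putExtra(Intent.EXTRA_TEXT, text);

        // Assumption: show the system share sheet so the user picks the target app.
        Intent chooser = Intent.createChooser(send, null);
        // Required if `context` is not an Activity (e.g. a Service or Application).
        chooser.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
        context.startActivity(chooser);
    }
}
```

Because this goes through `startActivity`, the user still has to pick a target app on screen, which matches the limitation noted above.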

hobbycommandline avatar Jan 06 '22 00:01 hobbycommandline

I only implemented this in my own app, but you're welcome to use my code. Unfortunately for you, I wrote it in Scheme (https://github.com/hobbycommandline/Hobby-Scheme-Command-Line/blob/master/app/src/main/assets/scheme/actions/note.scm#L43). If you get stuck at all, my action does work, so you can use it as a reference. There is a carve-out in my README that says Dicio is able to use any code from mine even though it's GPL instead of AGPL. finish-SEND is the method that dispatches the proper intent.

startActivity, which it calls, does set a flag and quit the app, as can be seen here: https://github.com/hobbycommandline/Hobby-Scheme-Command-Line/blob/master/app/src/main/java/org/hobby/dispatcher/IntentDispatcher.kt#L50

hobbycommandline avatar Jan 06 '22 01:01 hobbycommandline

To have the fallback of listening to the spoken words, a combination of recognized text and an audio recording could be sent to whatever app. My main interest would be to compose an email to myself with the text inline and the audio as an attachment, but that's just my todo workflow ;-)

muonIT avatar Jan 07 '22 17:01 muonIT

muonIT, there are standard formats for captioned audio / "subtitles", in case that helps (the subtitles don't have to be in the same file as the audio). I was hoping for something to compose interactively (as you would with a keyboard), though, so the audio wouldn't be needed unless doing a whole audio file.
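
As one concrete example of such a format, SubRip (.srt) stores each caption as a numbered cue with a start and an end time. The Java sketch below is only illustrative: the class `SrtCue` and its methods are hypothetical, and nothing in this thread says Dicio records timestamps for recognized text.

```java
import java.util.Locale;

/** Hypothetical helper: formats one recognized phrase as a SubRip (.srt) cue. */
final class SrtCue {
    private SrtCue() {}

    /** Builds a cue: index line, time range line, text line, then a blank separator line. */
    static String format(int index, long startMs, long endMs, String text) {
        return index + "\n"
                + timestamp(startMs) + " --> " + timestamp(endMs) + "\n"
                + text + "\n\n";
    }

    /** Converts milliseconds to the SRT timestamp layout HH:MM:SS,mmm. */
    static String timestamp(long ms) {
        long hours = ms / 3_600_000;
        long minutes = ms / 60_000 % 60;
        long seconds = ms / 1_000 % 60;
        long millis = ms % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, millis);
    }
}
```

A file of such cues can sit next to the audio recording, so the text and the audio stay in separate files.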

uhyf3 avatar Jan 12 '22 10:01 uhyf3

> It would be good if Dicio could take dictation, composing and modifying text.

Yeah, that would be a great addition. Thank you everyone for the information you collected, having the possibility to share text with (or talk directly to) the notes app or other apps should be considered.

Stypox avatar Jan 13 '22 13:01 Stypox

Might I suggest incorporating Jota Text Editor, a lightweight, easy-to-use, very stable, and compatible FOSS text editor, to facilitate sending it text as well as the cursor control, selection, and editing commands it supports? (Currently, it supports them via gestures/taps and buttons.)

The Android keyboard interface may be another option.

My thanks also to everyone.

uhyf3 avatar Jan 13 '22 14:01 uhyf3

@Stypox Allowing Dicio to be used as voice input in an app such as org.tasks would be great! I think that is related to this feature request.

thebiblelover7 avatar May 04 '22 13:05 thebiblelover7

As @uhyf3 stated, if you add the ability to provide textual input, please incorporate an editor that already exists so that you do not re-invent the wheel.

RokeJulianLockhart avatar May 04 '22 14:05 RokeJulianLockhart

I think two problems are being discussed here. Both would be welcome contributions, and if nobody does it before I reach that point, I would implement them, too.

  • Use Dicio as a Speech-To-Text app. This can be done by exposing Dicio as an STT to the system, so that in theory it can be used by e.g. keyboards.
  • Creating a skill that supports dictation and creating notes. This should possibly be done also in tight coordination with the note-taking app the user has as their default on their system. This way we don't have to reinvent the wheel (even copying over code from another app would partially be reinventing the wheel) and users will be happy since they would be able to use the notes app they like the best. I am not sure whether this is doable or not, as maybe there is no common interface for note apps.
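
For the first point, one mechanism Android offers is for an activity to handle the `RecognizerIntent.ACTION_RECOGNIZE_SPEECH` intent and return its transcription in `RecognizerIntent.EXTRA_RESULTS`. Whether Dicio would use this route is an assumption, not something confirmed in this thread. A minimal Java sketch of the returning side, where `SttResults.deliver` is a hypothetical helper and the activity is assumed to declare a matching intent filter in its manifest:

```java
import android.app.Activity;
import android.content.Intent;
import android.speech.RecognizerIntent;

import java.util.ArrayList;

/** Hypothetical helper: returns recognized text to the app that requested speech input. */
final class SttResults {
    private SttResults() {}

    static void deliver(Activity activity, ArrayList<String> hypotheses) {
        // Callers of ACTION_RECOGNIZE_SPEECH read the candidate transcriptions from EXTRA_RESULTS.
        Intent result = new Intent();
        result.putStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS, hypotheses);
        activity.setResult(Activity.RESULT_OK, result);
        activity.finish();
    }
}
```

Apps that start speech recognition with `startActivityForResult` would then receive the text without knowing which app produced it.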

Stypox avatar May 05 '22 22:05 Stypox

While I agree with @Stypox, I think both problems can be solved by point 1. Creating notes would be easiest to implement with STT. I think every notes app would have to find a way to be involved with Dicio for that to work. But again, just my thoughts.

thebiblelover7 avatar May 06 '22 03:05 thebiblelover7

An update for this: I found out that Athena is able to configure itself as a "Voice input" app. I couldn't find documentation about how that could be done online, but now I can look into Athena to see how they did it.

Stypox avatar May 06 '22 20:05 Stypox

> I think two problems are being discussed here. Both would be welcome contributions, and if nobody does it before I reach that point, I would implement them, too.
>
>   • Use Dicio as a Speech-To-Text app. This can be done by exposing Dicio as an STT to the system, so that in theory it can be used by e.g. keyboards.

That would be great!

>   • Creating a skill that supports dictation and creating notes. This should possibly be done also in tight coordination with the note-taking app the user has as their default on their system. This way we don't have to reinvent the wheel (even copying over code from another app would partially be reinventing the wheel) and users will be happy since they would be able to use the notes app they like the best. I am not sure whether this is doable or not, as maybe there is no common interface for note apps.

Especially if the system STT thing doesn't work out, "copying over code from another app [e.g. Jota Text Editor] would" be a good way to get standardized editing and saving features - notes apps often can't save on the filesystem, and rarely if ever can save in even 1 standard plain text format.

uhyf3 avatar May 12 '22 19:05 uhyf3

Another +1 for this.

I haven't seen any open source voice assistant on Android that has actually worked for me besides Dicio, and with offline Vosk no less! I'm really hoping these important features get ported!

But yes, sounds like getting this implemented to support the voice assistant button in keyboards is a good idea, too.

My question is, how would errors in dictation be handled? For example, the STT system is not perfect, so if you wanted to remove the last word(s), navigate back in the sentence by words, or add special punctuation, how would that be done? I know, sorry... that sounds like a long-term goal more than this. But it's unfortunately what most people used to VAs expect.
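
To make the question concrete, one possible approach is to treat a few spoken phrases as editing commands instead of text. The Java sketch below is purely illustrative: the class `DictationBuffer` and the command phrases ("delete last word", "period", "comma") are made up for this example, and nothing in this thread says Dicio works this way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: a text buffer where some utterances are commands rather than content. */
final class DictationBuffer {
    private final List<String> words = new ArrayList<>();

    /** Feeds one recognized utterance into the buffer. */
    void accept(String utterance) {
        String trimmed = utterance.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        switch (trimmed.toLowerCase(Locale.ROOT)) {
            case "delete last word":
                // Drop the most recent word, including any punctuation attached to it.
                if (!words.isEmpty()) {
                    words.remove(words.size() - 1);
                }
                break;
            case "period":
                attachToLastWord(".");
                break;
            case "comma":
                attachToLastWord(",");
                break;
            default:
                // Anything that is not a command is ordinary dictated text.
                for (String word : trimmed.split("\\s+")) {
                    words.add(word);
                }
        }
    }

    /** Appends a punctuation mark to the last word, or stores it alone if the buffer is empty. */
    private void attachToLastWord(String mark) {
        if (words.isEmpty()) {
            words.add(mark);
        } else {
            int last = words.size() - 1;
            words.set(last, words.get(last) + mark);
        }
    }

    /** Returns the composed text with single spaces between words. */
    String text() {
        return String.join(" ", words);
    }
}
```

A limitation of this design is that the command phrases can no longer be dictated literally, so a real implementation would need some way to escape them.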

mrjpaxton avatar May 23 '22 12:05 mrjpaxton

> My question is, how would errors in dictation be handled? Example being that the STT system is not perfect, so if you wanted to remove the last word(s), navigate back in the sentence by words, or add special punctuation, how would that be done?

You stop the dictation and manually edit visually on the touchscreen/by keyboard like any other app, right?

Anyways, another strong +1 here. I was shocked that it could listen to all of my words for all of these advanced commands but doesn't even have the most basic feature of transcribing my speech/saving what it captures instead of just deleting it every time. That's actually the only reason I'd use it; I don't care about the other features at all.

KeronCyst avatar Jun 20 '22 18:06 KeronCyst

Would you mind testing #109? Does it satisfy your needs?

Stypox avatar Dec 13 '22 11:12 Stypox

@Stypox Just did a quick test. The dictation, called from the settings menu in Dicio, works very well and the share feature puts the recognized text right in the email message body - so very close to what I intended! Thank you very much for your efforts! They are much appreciated!! :smiley: :+1:

muonIT avatar Dec 13 '22 12:12 muonIT

Just to say I'd love to see something like this - would make Dicio even more useful! Currently I'm finding that this is not fixed by the pull request above. The only place I seem to be able to open Dicio as a navigation drawer and therefore am able to copy the text is in Firefox through the voice input option (and this also places the text straight into the search bar, creating a privacy risk). The notes app I use does not seem to have a voice input option, nor is my keyboard (gboard) recognising Dicio for voice input purposes. And, in line with other issues referenced by others, I don't seem to be able to set Dicio as a general purpose voice input. I wonder if a standalone skill for this might make sense as a workaround for users having issues with other parts of system integration. It could be kept minimal - it could be a 'copy to clipboard' skill and a 'share text to app' skill. Thanks for all your work on the app!

cannycartographer avatar Mar 15 '23 09:03 cannycartographer

It definitely doesn't register as a voice input provider.

(Screenshot attached: Screenshot_20230422-003019)

~~I simply want to be able to select text and have Dicio dictate that like Apple devices can with Siri~~ and use Dicio to type at a text box.

RokeJulianLockhart avatar Apr 21 '23 23:04 RokeJulianLockhart

I agree. The "composing and modifying" is needed before the text is sent to the other app. Perhaps there are exceptions, but usually, for security & privacy, it would be best to compose/modify before sending. More ways to send the text would be good too. (I particularly like the "copy text to clipboard," "save to plain text file," and 'save audio captions / "subtitles"' possible features.)

uhyf3 avatar Apr 21 '23 23:04 uhyf3

I see that this is closed as fixed, but I don't see the feature mentioned anywhere within the app? How do I use this? Thanks!

forteller avatar Sep 21 '23 00:09 forteller

@forteller you can press on the "speech to text service" button in the drawer

Stypox avatar Sep 21 '23 06:09 Stypox

Ah, I see. Thank you! Shouldn't there be a trigger word for this, though? :)

forteller avatar Sep 21 '23 09:09 forteller