
Offline, privacy-respecting speech to text

Open RustoMCSpit opened this issue 1 year ago • 1 comments

Feature description

Speech-to-text transcription of audio recordings that recognises multiple speakers. The text of any recording should be viewable via a dropdown or a search bar, and all transcribed text should be exportable as well.

Why do you want this feature?

This would also give you a transcript, so you could have a search bar to go through your voice recordings and click through to the exact moment a word was said. So if I typed 'adam' it might find 4 hits from the past 4 months: file191: 00:07; file179: 12:23, 16:30; file73: 06:42.

You could then click on those moments to find the one you're looking for.
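To make the search idea concrete, here is a rough sketch in Python. It assumes the speech-to-text engine can return a start time for each word; the `Word` type, the `build_index` and `search` helpers, and the sample data are all made up for illustration and are not part of any existing app or library. It ignores punctuation and only matches whole words.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One transcribed word with its start time in seconds (hypothetical type)."""
    text: str
    start: float


def build_index(transcripts):
    """Map each lowercased word to {file name: [start times in seconds]}."""
    index = defaultdict(lambda: defaultdict(list))
    for file_name, words in transcripts.items():
        for word in words:
            index[word.text.lower()][file_name].append(word.start)
    return index


def format_time(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def search(index, query):
    """Return {file name: [MM:SS, ...]} for every occurrence of the query word."""
    hits = index.get(query.lower(), {})
    return {name: [format_time(t) for t in sorted(times)]
            for name, times in hits.items()}


# Made-up sample data that mirrors the 'adam' example.
transcripts = {
    "file191": [Word("hi", 5.0), Word("Adam", 7.0)],
    "file179": [Word("adam", 743.0), Word("said", 744.0), Word("Adam", 990.0)],
    "file73": [Word("adam", 402.0)],
}
index = build_index(transcripts)
results = search(index, "adam")
```

With this sample data, `results` maps file191 to 00:07, file179 to 12:23 and 16:30, and file73 to 06:42. Each hit keeps its time in seconds in the index, so a tap on a result could seek playback straight to that moment.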

This could also be used for tagging. For example, if I'm working on a project called 'block runner', I could search for all mentions and tag them all easily.

Additional information

Futo has partially delivered on this with an excellent FOSS solution: https://gitlab.futo.org/alex/voiceinput https://voiceinput.futo.org/

But the Futo solution currently only works inside other apps and is not integrated directly into a voice recorder app. Adding Futo's speech-to-text capabilities to Simple Voice Recorder would easily put the voice recorder on par with Google's proprietary app.

https://github.com/FossifyOrg/Voice-Recorder/issues/34

RustoMCSpit, Nov 23 '24 23:11

You should be able to see the transcript underneath the waveform and see it move along with it as the recording goes on, with the current word highlighted. Clicking on it would bring you to the full transcript, which you can copy and paste.
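Highlighting the current word could be as simple as a binary search over word start times. This is only a sketch in Python: it assumes the transcript comes with a sorted list of start times in seconds, and `current_word_index` and the sample values are made-up names and data, not from any existing app.

```python
from bisect import bisect_right


def current_word_index(start_times, position):
    """Return the index of the last word that started at or before `position`
    (in seconds), or None if playback has not reached the first word yet.

    `start_times` must be sorted in ascending order.
    """
    i = bisect_right(start_times, position) - 1
    return i if i >= 0 else None


# Made-up sample transcript: words and their start times in seconds.
words = ["block", "runner", "demo"]
start_times = [0.5, 1.1, 1.8]

i = current_word_index(start_times, 1.2)
highlighted = words[i] if i is not None else None
```

At a position of 1.2 seconds this picks "runner". The lookup is cheap enough to run on every waveform redraw, even for long recordings.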

You could pair this with pitch detection and then have it do MIDI export and notation transcription with the words attached.
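For the MIDI part, a detected pitch in Hz maps to a MIDI note number with the standard equal-temperament formula (A4 = 440 Hz = note 69). The sketch below, in Python, assumes some pitch detector has already produced one frequency per word; the `frequency_to_midi` helper and the sample frequencies are made up for illustration.

```python
import math


def frequency_to_midi(freq_hz, a4_hz=440.0):
    """Convert a frequency in Hz to the nearest MIDI note number (A4 = 69)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))


# Made-up (word, detected pitch in Hz) pairs, as a pitch detector might give.
sung = [("block", 261.63), ("runner", 440.0)]

# Attach a MIDI note number to each word.
notes = [(word, frequency_to_midi(freq)) for word, freq in sung]
```

Here `notes` comes out as `[("block", 60), ("runner", 69)]`, i.e. middle C and A4. Writing an actual MIDI file and note durations would need more than this.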

RustoMCSpit, Nov 24 '24 00:11