heromanofe comments

Results: 12 comments by heromanofe

I have an issue where when I am using real time transcription, when I am not talking, it seems like it parses random text.

Yeah, but I don't understand how VAD can fix random text being detected. I will check what audio is recorded and report back.

I've noticed an interesting thing: I have a multilingual model, and it translates my speech when I think it shouldn't.

Speaking of which, I would be interested in generating those bin and tflite files myself, or at least having some place where I can download other models. I will check in...

https://1drv.ms/u/s!AgXqUQNVnl-xmZ07Nq71pVUibaZUOg?e=blb6zR

Okay, you were sooo right :D I remembered that I had looked into VAD before. I integrated https://github.com/gkonovalov/android-vad into my project using `implementation 'org.tensorflow:tensorflow-lite-task-audio:0.4.0'` and `implementation 'com.github.gkonovalov.android-vad:yamnet:2.0.4'`, and in your code:...

> Has your problem been solved?

It was a VAD problem, though I wouldn't celebrate just yet. I noticed that some speech is detected as silence instead :D I...
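
Speech being cut off as silence is a common side effect of frame-by-frame VAD decisions. One widely used mitigation, not something taken from this thread or from the android-vad API, is a "hangover": keep passing audio for a few frames after the last frame the VAD marked as speech. The sketch below is a minimal illustration under that assumption; `HangoverSmoother`, `accept`, and the frame count are hypothetical names and values chosen for the example.

```java
// Hypothetical helper: smooths raw per-frame VAD decisions with a hangover window.
class HangoverSmoother {
    private final int hangoverFrames;   // how many non-speech frames to keep after speech
    private int framesSinceSpeech;      // non-speech frames seen since the last speech frame

    HangoverSmoother(int hangoverFrames) {
        this.hangoverFrames = hangoverFrames;
        // Start "expired" so that leading silence is rejected.
        this.framesSinceSpeech = hangoverFrames;
    }

    /** Returns true if the current frame should be forwarded to the recognizer. */
    boolean accept(boolean vadSaysSpeech) {
        if (vadSaysSpeech) {
            framesSinceSpeech = 0;
            return true;
        }
        if (framesSinceSpeech < hangoverFrames) {
            framesSinceSpeech++;        // still inside the hangover window
            return true;
        }
        return false;                   // sustained silence: drop the frame
    }
}
```

A longer hangover keeps more trailing audio (fewer clipped word endings) at the cost of letting more silence through, so the value would need tuning against real recordings.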

I don't know about the app; all I did was this (in the Recorder.java file): ![image](https://github.com/vilassn/whisper_android/assets/51365889/d63a580c-acb6-442e-acac-643e3568110e) ![image](https://github.com/vilassn/whisper_android/assets/51365889/1d8e0fd1-fed2-42df-9708-220613a3eac8) ![image](https://github.com/vilassn/whisper_android/assets/51365889/7b022fdd-1ec3-4400-b9e7-561485d09455)

Quick update on my situation: I decided to write Kotlin code for real-time recognition. It works very simply: I am taking your recording system and just leaving out 1-second chunks...
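
The comment above is cut off, so the exact logic is not known. As a rough illustration of the general idea (splitting a recording into 1-second chunks and leaving out the silent ones), here is a minimal Java sketch. It assumes 16 kHz mono 16-bit PCM and uses a plain RMS energy threshold as a stand-in for a real VAD such as the Yamnet model from android-vad. `ChunkGate`, `keepSpeechChunks`, and the threshold value are hypothetical and not part of whisper_android or android-vad.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: drops 1-second chunks whose energy is below a threshold.
class ChunkGate {
    // Assumption: 16 kHz mono 16-bit PCM, so one second is 16000 samples.
    static final int SAMPLE_RATE = 16000;

    /** Root-mean-square amplitude of samples[start, end), in raw 16-bit units. */
    static double rms(short[] samples, int start, int end) {
        double sumSquares = 0;
        for (int i = start; i < end; i++) {
            sumSquares += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sumSquares / (end - start));
    }

    /** Splits the recording into 1-second chunks and keeps those at or above the threshold. */
    static List<short[]> keepSpeechChunks(short[] recording, double rmsThreshold) {
        List<short[]> kept = new ArrayList<>();
        for (int start = 0; start < recording.length; start += SAMPLE_RATE) {
            // The last chunk may be shorter than one second.
            int end = Math.min(start + SAMPLE_RATE, recording.length);
            if (rms(recording, start, end) >= rmsThreshold) {
                kept.add(Arrays.copyOfRange(recording, start, end));
            }
        }
        return kept;
    }
}
```

Only the kept chunks would then be handed to the recognizer. An energy threshold is much cruder than a trained VAD model (it passes loud non-speech noise), so it is shown here only to make the chunk-filtering step concrete.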

@matanel-6over6 scroll up for the screenshots. Here is the library: https://github.com/gkonovalov/android-vad. You need VAD, and that was a pretty good solution for me.

You add that library in Gradle (`implementation 'org.tensorflow:tensorflow-lite-task-audio:0.4.0'` and `implementation 'com.github.gkonovalov.android-vad:yamnet:2.0.4'`), and for this project, Recorder.java