whisper-asr-webservice

uploading certain audio files results in empty transcription

Open thet0ast3r opened this issue 2 years ago • 5 comments

Hello,

I have noticed the following behavior, which seems unintended:

using v1.1.0, either cpu or gpu:

Uploading an .m4a file with two audio channels results in an empty transcript, with no error at all. This happens when encoding is set to true.

I expected the file to be either decoded and transcribed successfully or rejected with an error. Re-encoding to mp3 solves the problem but seems like an unnecessary complication.

request url: http://localhost:9000/asr?method=faster-whisper&task=transcribe&encode&output=json

response: {"language": "en", "segments": [], "text": ""}
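For anyone trying to reproduce this, here is a minimal sketch that rebuilds the request URL above and flags the "empty but successful" response. The helper names `build_asr_url` and `is_silent_failure` are made up for illustration, and writing `encode=true` explicitly (instead of the bare `encode` in the URL above) is an assumption.

```python
import json
from urllib.parse import urlencode


def build_asr_url(base="http://localhost:9000/asr", encode=True):
    """Build the /asr URL with the query parameters used in this report."""
    params = {
        "method": "faster-whisper",
        "task": "transcribe",
        "encode": str(encode).lower(),  # assumption: explicit true/false value
        "output": "json",
    }
    return f"{base}?{urlencode(params)}"


def is_silent_failure(body: str) -> bool:
    """Return True when the JSON body has no segments and no text."""
    result = json.loads(body)
    return not result.get("segments") and not result.get("text", "").strip()


if __name__ == "__main__":
    print(build_asr_url())
    print(is_silent_failure('{"language": "en", "segments": [], "text": ""}'))
```

A client could call `is_silent_failure` on every response and treat a hit as an error, since the service itself does not report one.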

Tested with base model & large-v2.

using docker logs does not show any errors.

edit:

I remuxed the file with -movflags faststart, and now it works. It seems to be the same problem as in #42.
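The remux workaround can be sketched roughly as follows. The `faststart` flag tells ffmpeg to move the MP4/M4A index (the moov atom) to the front of the file. That this is why the service fails on such files is only a guess. The `-c copy` option (copy streams without re-encoding) and the `-y` overwrite flag are assumptions about the exact command, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def faststart_command(src, dst):
    """Return the ffmpeg argument list for a faststart remux (no re-encode)."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", str(src),
        "-c", "copy",              # assumption: copy streams, do not re-encode
        "-movflags", "faststart",  # move the moov atom to the start of the file
        str(dst),
    ]


def remux_faststart(src, dst):
    """Run the remux; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(faststart_command(src, dst), check=True)
    return Path(dst)
```

Calling `remux_faststart("input.m4a", "fixed.m4a")` before uploading should be equivalent to the manual remux, provided ffmpeg is on the PATH.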

thet0ast3r May 10 '23 09:05

I can confirm that this is still an issue with .m4a files. Short ones seem to work, but when they get longer than about 20-30 seconds the API responds with an empty string...

Anyways I tested several file types locally on an Nvidia 3060 and here are the results of my tests:

Transcribing 1 minute of speech. [image]

EvarDion Dec 31 '23 03:12

Same here. MP4 and M4A files not working. Tried with both encode=true and encode=false in the request. MP3 files worked fine.

encode=true: blank response
encode=false: 500 error

mattymcfatty Jan 05 '24 17:01

I modified the code and fixed the error. It seems that the run method of ffmpeg returns a tuple instead of a bytes-like object, which is why it does not work.
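To illustrate the kind of fix described here (not the actual patch, which was not posted): in the ffmpeg-python package, `run()` returns a `(stdout, stderr)` tuple, so the audio bytes have to be unpacked before decoding. The sketch below assumes the service uses that package and asks ffmpeg for 16-bit little-endian PCM. Both helper names are hypothetical.

```python
import sys
from array import array


def stdout_bytes(run_result):
    """Accept raw bytes or an (stdout, stderr) tuple and return stdout bytes."""
    if isinstance(run_result, tuple):
        run_result = run_result[0]  # keep stdout, drop stderr
    if not isinstance(run_result, (bytes, bytearray)):
        raise TypeError(f"expected bytes-like audio data, got {type(run_result).__name__}")
    return bytes(run_result)


def pcm16_to_floats(run_result):
    """Decode signed 16-bit little-endian PCM into floats in [-1.0, 1.0)."""
    samples = array("h")  # 2-byte signed integers, native byte order
    samples.frombytes(stdout_bytes(run_result))
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian
    return [s / 32768.0 for s in samples]
```

With this in place, both `pcm16_to_floats(out)` and `pcm16_to_floats((out, err))` decode the same audio, and anything else raises a `TypeError`.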

ixiami1314 Jan 24 '24 05:01

same problem here, with wav files

cheremovsky Feb 16 '24 15:02

the same here

LuisMalhadas Mar 23 '24 14:03

> edit:
>
> I remuxed the file with -movflags faststart; and now it works. It seems that the same problem as in https://github.com/ahmetoner/whisper-asr-webservice/issues/42 is happening.

Thanks for the update, works for me

ppenelon Nov 10 '24 12:11