Speech - Support other Audio Formats

Open hactar opened this issue 8 years ago • 7 comments

Currently wit supports PCM (WAV), mp3, ulaw, and raw.

I'm working with the iOS framework, which uses PCM. I've tweaked it to use ulaw instead, and on slow internet connections (bad 3G and edge) it nearly doubles the speed of recognition (from 10 seconds down to nearly 5), but the sound quality is worse.

iOS does not support recording in mp3 format. It offers the following (https://developer.apple.com/library/ios/qa/qa1615/_index.html):

kAudioFormatAppleLossless (alac), kAudioFormatAppleIMA4 (ima4), kAudioFormatiLBC (ilbc), kAudioFormatULaw (ulaw), kAudioFormatALaw (alaw)

According to http://chinaxxren.iteye.com/blog/1750476 the best compression vs sound quality would be provided by ima4, or at least alac? ilbc might be an option too, I'm not a codec expert, but my guess is anything is better than pcm and ulaw.

Could you add ima4, alac or ilbc support to /speech? I'd update the iOS sdk accordingly.

hactar avatar Apr 28 '16 16:04 hactar
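As a rough illustration of the bandwidth argument in hactar's comment above, the sketch below compares the nominal payload of one second of mono audio in each encoding. The 16 kHz sample rate and the helper name `bytes_per_second` are assumptions for illustration; the thread does not say which rate the iOS SDK records at.

```python
# Rough upload-size comparison for one second of mono speech audio.
# ASSUMPTION: 16 kHz sample rate; the thread does not state the rate used.
SAMPLE_RATE_HZ = 16_000

# Nominal bits stored per sample for each encoding.
BITS_PER_SAMPLE = {
    "pcm16": 16,  # linear PCM, 16-bit
    "ulaw": 8,    # mu-law, 8 bits per sample
    "ima4": 4,    # IMA ADPCM, about 4 bits per sample before packet headers
}


def bytes_per_second(encoding: str, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Return the nominal payload size of one second of mono audio."""
    return sample_rate_hz * BITS_PER_SAMPLE[encoding] // 8


if __name__ == "__main__":
    for name in BITS_PER_SAMPLE:
        print(f"{name}: {bytes_per_second(name)} bytes/s")
```

Under these assumptions ulaw halves the payload relative to 16-bit PCM, which is consistent with the reported drop from about 10 seconds to about 5 on slow connections, and a 4-bit ADPCM encoding would halve it again.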

I believe the encoding ima-adpcm that we provide is actually ima4, but haven't had the chance to try it. Could you try and let us know?

blandinw avatar Apr 29 '16 20:04 blandinw

Thanks, didn't notice that. I've tried, but I can't get it to work, my stream is not recognised by the server, for example:

2016-05-04 03:01:41.528 Wave[384:35387] Wit response 200 (0.761610 s) { "msg_id" : "bb141d03-b343-4f2d-b78d-9c9d74605d30", "_text" : null, "outcomes" : [ ] }

I'm not saying this is a server side issue, could be that I'm setting up something wrong on my end. You can find my code in my wit-ios-sdk fork: https://github.com/hactar/wit-ios-sdk

In particular, the recorder: https://github.com/hactar/wit-ios-sdk/blob/master/Wit/WITRecorder.m#L216-L225 and the uploader: https://github.com/hactar/wit-ios-sdk/blob/master/Wit/WITUploader.m#L80-L82

The code can be triggered by adding formatToUse = kAudioFormatAppleIMA4; to https://github.com/hactar/wit-ios-sdk/blob/master/Wit/WITRecordingSession.m#L63

hactar avatar May 04 '16 01:05 hactar
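For anyone retrying the experiment above, here is a minimal sketch of building the `Content-Type` header for a raw upload. The encoding name `ima-adpcm` comes from blandinw's reply; the parameter names (`encoding`, `bits`, `rate`, `endian`) are my recollection of the Wit HTTP API documentation and should be checked against it, and the example values and the helper name `raw_content_type` are hypothetical.

```python
def raw_content_type(encoding: str, bits: int, rate: int, endian: str) -> str:
    """Build a Content-Type value describing headerless audio.

    ASSUMPTION: the parameter names follow the Wit /speech documentation
    for audio/raw; verify them before relying on this.
    """
    return f"audio/raw;encoding={encoding};bits={bits};rate={rate};endian={endian}"


def speech_headers(token: str, content_type: str) -> dict:
    """Return request headers for an upload; `token` is a placeholder."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": content_type,
    }


if __name__ == "__main__":
    # Hypothetical values: 4-bit IMA ADPCM at 16 kHz.
    ct = raw_content_type("ima-adpcm", 4, 16000, "little")
    print(speech_headers("YOUR_TOKEN", ct))
```

Whatever values are used, they have to describe the bytes actually sent. A mismatch between the header and the recorder's output format is one plausible cause of an empty `_text`, though the thread does not confirm the cause.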

I use Wit for a Facebook bot, and the user audio is sent as an mp4 file. It would be nice if you could accept that format as well. Currently I'm trying to convert from .mp4 to .wav before calling the Wit API, but it causes a huge delay in answering the user (download the audio => convert audio => understand audio => answer user).

JoabMendes avatar May 15 '17 14:05 JoabMendes
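The download => convert => understand pipeline described above could be sketched roughly as follows. This assumes `ffmpeg` is installed and on the PATH, and that `https://api.wit.ai/speech` is the full URL of the `/speech` endpoint discussed in this thread; the function names and the 16 kHz target rate are hypothetical choices, not anything the Wit team prescribed.

```python
import subprocess
import urllib.request

# ASSUMPTION: full URL of the /speech endpoint discussed in this thread.
WIT_SPEECH_URL = "https://api.wit.ai/speech"


def ffmpeg_mp4_to_wav_cmd(src: str, dst: str, rate_hz: int = 16000) -> list:
    """Build an ffmpeg command that writes the audio of `src` as mono PCM WAV."""
    return [
        "ffmpeg",
        "-nostdin",            # never wait for keyboard input
        "-loglevel", "error",  # print errors only
        "-y",                  # overwrite dst if it already exists
        "-i", src,             # local path, or a URL if ffmpeg has network support
        "-vn",                 # ignore any video stream
        "-ac", "1",            # downmix to mono
        "-ar", str(rate_hz),   # resample to the target rate
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM
        "-f", "wav",
        dst,
    ]


def convert(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_mp4_to_wav_cmd(src, dst), check=True)


def post_wav(path: str, token: str) -> bytes:
    """Upload a WAV file and return the raw response body."""
    with open(path, "rb") as f:
        body = f.read()
    req = urllib.request.Request(
        WIT_SPEECH_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": "audio/wav"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Passing the attachment URL straight to ffmpeg as `src` may remove the separate download step, which is one way to shave some of the delay, but the conversion itself is still required.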

Is there any update on mp4 support? As @JoabMendes said, Facebook provides audio files as mp4, and this issue has no update on whether support has been developed or is on the roadmap. Converting to another format first is a long detour that can cause problems before we can reply to the user. Is there a way to send mp4 as raw as a workaround, for example? If so, what parameters should be provided?

And thanks in advance for your great work.

mohammad-melhem avatar Oct 08 '19 13:10 mohammad-melhem

We don't currently have any plans to support mp4, but you're right that it is something we should consider. Thanks.

lycai-fb avatar Oct 11 '19 23:10 lycai-fb

Hi, the mp4 file from FB Messenger contains audio, and the codec used is AAC LC. So even when using the content type audio/raw with the parameters that define the encoding, the response is always empty. Neither audio file upload nor streaming works.

venturio1256 avatar Sep 01 '20 08:09 venturio1256
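The empty response described above fits the way the formats are built: audio/raw describes headerless samples, while an mp4 file is a container holding compressed AAC frames, so declaring it as raw cannot make the bytes decodable. A small sketch that checks what a payload really is before uploading follows; the function name `sniff_container` is hypothetical, and the check relies on the common case where an mp4 file starts with an `ftyp` box.

```python
def sniff_container(data: bytes) -> str:
    """Guess the container from the first bytes of a payload.

    Returns "wav", "mp4", or "unknown". This is a heuristic: it only
    recognises RIFF/WAVE headers and mp4 files that begin with an
    `ftyp` box, which is the usual but not the only layout.
    """
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if len(data) >= 8 and data[4:8] == b"ftyp":
        return "mp4"
    return "unknown"


if __name__ == "__main__":
    sample = b"\x00\x00\x00\x18ftypM4A \x00\x00\x00\x00"
    print(sniff_container(sample))
```

A payload that sniffs as "mp4" needs to be decoded to one of the supported formats before upload, whatever Content-Type is declared.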

So, are there any fast workarounds that avoid converting the audio, or other solutions for converting mp4 to text?

EslamHiko avatar May 05 '21 04:05 EslamHiko

Closing due to no movement on the issue. Please re-open or file a new task should the issue persist.

Barbog avatar Apr 18 '23 09:04 Barbog

I agree. It is still not planned. No problem.

On Tue, Apr 18, 2023, 3:54 AM Barbog @.***> wrote:

Closed #217 https://github.com/wit-ai/wit/issues/217 as not planned.


venturio1256 avatar Apr 18 '23 14:04 venturio1256