ludwig Predictions on single audio data points from memory using python api

Open Peetee06 opened this issue 3 years ago • 2 comments

Hi,

I want to build an application that reads audio from a microphone stream and makes live predictions on 1-second chunks, classifying each into one of 2 categories with a pre-trained Ludwig model. As far as I know, it is only possible to predict on audio data that has been saved as files in the filesystem. For live predictions it would be much faster if there were a way to feed single Python objects or WAV-encoded bytes to the model and have it predict the corresponding class one audio chunk at a time. Is there a way to do that with Ludwig with as little overhead as possible?

Best regards, Peter

Apr 23 '21 07:04 Peetee06

@Peetee06 unfortunately this is not possible right now, but I completely agree with you that it's a very much needed feature. We have a similar issue with the image features, as they behave similarly (both read from files right now). We'll make this a priority after we are done with some internal refactoring we are doing for v0.4.

In the meantime, as you have already figured out, the solution is to save the content to an audio file, maybe in a temp dir, and let Ludwig load it back, with the clear overhead this entails.
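A minimal sketch of that temp-file workaround, using only the Python standard library. The helper name `predict_chunk_via_tempfile` and the `predict_fn` callback are hypothetical, and the defaults (16 kHz, 16-bit, mono PCM) are assumptions that would need to match the audio the model was trained on. The callback is where the actual Ludwig call would go.

```python
import os
import tempfile
import wave


def predict_chunk_via_tempfile(pcm_bytes, predict_fn, sample_rate=16000,
                               sample_width=2, channels=1):
    """Write one raw PCM chunk to a temporary WAV file and pass its path to predict_fn.

    predict_fn is a hypothetical callback taking a file path; it is expected
    to wrap the model's own file-based prediction call.
    """
    # delete=False and an explicit close() so the file can be reopened by
    # path on every platform (Windows refuses to reopen an open temp file).
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    tmp.close()
    try:
        with wave.open(tmp.name, "wb") as wav_file:
            wav_file.setnchannels(channels)
            wav_file.setsampwidth(sample_width)   # bytes per sample
            wav_file.setframerate(sample_rate)
            wav_file.writeframes(pcm_bytes)
        return predict_fn(tmp.name)
    finally:
        # Always clean up, even if the prediction raises.
        os.remove(tmp.name)


# Possible wiring to Ludwig (an assumption, not confirmed in this thread:
# "audio_path" stands in for whatever the model's audio input feature is
# called, and the predict() signature should be checked against the
# installed Ludwig version):
#
#   import pandas as pd
#   from ludwig.api import LudwigModel
#
#   model = LudwigModel.load("path/to/model")
#
#   def ludwig_predict(path):
#       predictions, _ = model.predict(dataset=pd.DataFrame({"audio_path": [path]}))
#       return predictions
```

Loading the model once outside the loop and only writing the small WAV file per chunk keeps the extra cost down to one file write and one file read per prediction.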

Apr 24 '21 00:04 w4nderlust

@w4nderlust looking forward to the feature. I will go with the workaround in the meantime, as you suggested.

Thanks for the great work on Ludwig! :)

Apr 26 '21 08:04 Peetee06
