Justin Uberti comments

Results: 59 comments of Justin Uberti

Add 8-bit and 4-bit quant support

Still hitting these issues with v0.3:
```
# just infer --text_only --prompt hi -q 8 -m fixie-ai/ultravox-v0_3
poetry run python -m ultravox.tools.infer_tool --text_only --prompt hi -q 8 -m fixie-ai/ultravox-v0_3
config.json:...
```

Add 8-bit and 4-bit quant support

Notes from @farzadab:
> The error "weight is on the meta device" means that you're likely running out of memory when doing the quantization somehow. The "meta" device is an...

Demo of emotion detection using BLSP-EMO approach

Deprioritized for the time being.

Experiment with speech/text interleaving

The goal here would just be to allow the model to see interleaved text during stage 1 training, which should help it learn text-audio invariance. So once we have the...

Support longer audio contexts

Interesting. Do you have a sample output dataset I could take a look at?

Support longer audio contexts

Hmm, I listened to a few clips and I wonder if the merging is the right way to do this. The audio clips tend to be fairly different with their...

Support longer audio contexts

OK, I can get behind that. I still think this warrants further investigation though:
```
eval/covost2-asr-es_en.2k-asr:0.14595425715933116
eval/covost2_long_audio-asr-combine-5-es_en.2k-asr:0.1345223909283106
eval/covost2_long_audio-asr-combine-10-es_en.2k-asr:0.20847972323659428
```
It seems odd that combining 5x is much better than combining...
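To put the three numbers quoted above side by side, here is a minimal sketch that computes each score's relative change against the non-combined run. The scores are copied verbatim from the comment. Treating the first entry as the baseline, and the helper name `relative_change`, are assumptions made for illustration. The comment does not name the metric, so the sketch only compares magnitudes.

```python
# Scores copied verbatim from the comment above. The metric itself is not
# named there, so this only compares magnitudes against the first entry.
scores = {
    "eval/covost2-asr-es_en.2k-asr": 0.14595425715933116,
    "eval/covost2_long_audio-asr-combine-5-es_en.2k-asr": 0.1345223909283106,
    "eval/covost2_long_audio-asr-combine-10-es_en.2k-asr": 0.20847972323659428,
}

# Assumption: the non-combined run serves as the reference point.
baseline = scores["eval/covost2-asr-es_en.2k-asr"]


def relative_change(value: float, reference: float) -> float:
    """Return (value - reference) / reference (hypothetical helper)."""
    return (value - reference) / reference


changes = {name: relative_change(v, baseline) for name, v in scores.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1%} vs. baseline")
```

On these numbers the combine-5 score sits roughly 8% below the baseline and the combine-10 score roughly 43% above it, which is the gap the comment flags.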

Justin Uberti

Add 8-bit and 4-bit quant support

Add 8-bit and 4-bit quant support

Demo of emotion detection using BLSP-EMO approach

Experiment with speech/text interleaving

Support longer audio contexts

Support longer audio contexts

Support longer audio contexts

Support longer audio contexts

Unified dataset specification

Please make Data Channels more suitable for realtime data