metavoice-src: a few errors while loading TTS in the Colab notebook

Hello team,

This is an interesting repo for experimenting with voice. I came across a few issues while trying out TTS in Colab.

tyro and julius are listed in the Poetry file, but when running the TTS it asks to install them again.
Below is the screenshot where we are unable to load the TTS.

Both of the above are from the Colab notebook.

Looking forward to your reply.

Keep rocking..

Mar 15 '24 13:03 muralidhar972

@lucapericlp is looking into it

Mar 18 '24 16:03 sidroopdaska

Hey @muralidhar972, thanks for the issue! We rolled out Poetry support recently but we didn't update the Colab link. I had a quick look at this, but I'm struggling to get things running on a T4 since the runtime keeps getting disconnected. If you want to have a go until I'm able to debug further, I'm using this updated notebook to test things out. I'll post an update here once we've merged the finalised version into main.

Mar 18 '24 16:03 lucapericlp
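
A minimal sketch, not taken from the thread, of how one might check in a Colab cell whether the modules named in the original report (tyro, julius) are importable in the running kernel before calling the TTS code. The helper name `missing_modules` is hypothetical; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the current interpreter cannot find.

    `missing_modules` is a hypothetical helper for illustration, not part of metavoice-src.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # tyro and julius are the two packages mentioned in the original report.
    missing = missing_modules(["tyro", "julius"])
    if missing:
        print("Not importable in this runtime:", ", ".join(missing))
    else:
        print("tyro and julius are importable in this runtime.")
```

If the check reports missing modules even though they are declared in the Poetry file, one possible explanation (an assumption, not confirmed in this thread) is that the packages were installed into a separate virtual environment rather than into the interpreter the notebook kernel is running.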

Hey @lucapericlp, I actually tried your updated notebook, but I have dependency issues between audiocraft and torch, torchvision, etc.

Mar 22 '24 14:03 nassimabenammar

Still not working, and yours isn't either. When's the fix coming?

Mar 23 '24 15:03 SecretiveMonkey

@lucapericlp Hey!

metavoice is amazing! :) Unfortunately, the Colab does not work for me. I always get this error:

TorchRuntimeError: Failed running call_function <built-in function scaled_dot_product_attention>(*(FakeTensor(..., device='cuda:0', size=(2, 16, s0, 128)), FakeTensor(..., device='cuda:0', size=(2, 16, 2048, 128), dtype=torch.float16), FakeTensor(..., device='cuda:0', size=(2, 16, 2048, 128), dtype=torch.float16)), **{'attn_mask': FakeTensor(..., device='cuda:0', size=(1, 1, s0, 2048), dtype=torch.bool), 'dropout_p': 0.0}):
Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: c10::Half and value.dtype: c10::Half instead.

from user code:
  File "/content/metavoice-src/fam/llm/fast_inference_utils.py", line 131, in prefill
    logits = model(x, spk_emb, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 160, in forward
    x = layer(x, input_pos, mask)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 179, in forward
    h = x + self.attention(self.attention_norm(x), mask, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 222, in forward
    y = F.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0)

Any advice would be highly appreciated!

Mar 27 '24 23:03 sebastianrueckerai

Hey @sebastianrueckerai, thanks for the enthusiasm! Sorry that you're bumping into this issue, we're tracking it here: https://github.com/metavoiceio/metavoice-src/issues/108. There's a temporary fix in the issue linked there which you can try out in the meantime.

Mar 28 '24 07:03 lucapericlp
