
few errors while loading tts in colab notebook

Open muralidhar972 opened this issue 1 year ago • 6 comments

Hello team,

An interesting repo for experimenting with voice. I came across a few issues while experimenting with TTS in Colab.

  1. tyro and julius are listed in the poetry file, but while loading the TTS it asks to install them again
  2. Below is the screenshot where we are unable to load the TTS

Both of the above are from the Colab file.

Looking forward to your reply.

Keep rocking..

muralidhar972 avatar Mar 15 '24 13:03 muralidhar972

@lucapericlp is looking into it

sidroopdaska avatar Mar 18 '24 16:03 sidroopdaska

Hey @muralidhar972, thanks for the issue! We rolled out poetry support recently but we didn't update the colab link. I had a quick look at this but I'm struggling to get things running on a T4 since the runtime keeps getting disconnected. If you want to have a go until I'm able to further debug, I'm using this updated notebook to test things out. I'll post an update here once we've merged the finalised version into main.

lucapericlp avatar Mar 18 '24 16:03 lucapericlp

Hey @lucapericlp, I actually tried your updated notebook, but I have dependency issues between audiocraft and torch, torchvision, etc.

nassimabenammar avatar Mar 22 '24 14:03 nassimabenammar
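
For anyone chasing a conflict like the audiocraft / torch / torchvision one mentioned above, it may help to first list which versions the Colab runtime actually has installed before changing anything. The snippet below is only a minimal sketch using the standard library; the helper name `installed_versions` is hypothetical, and the package list is simply the names mentioned in this thread, not an official requirements list for the repo.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment
            versions[name] = None
    return versions


# Package names taken from this thread (an assumption, not a complete list)
report = installed_versions(["torch", "torchvision", "audiocraft", "tyro", "julius"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the error usually makes it easier to tell which pin is causing the clash.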

Still not working, and yours isn't either. When's the fix coming?

SecretiveMonkey avatar Mar 23 '24 15:03 SecretiveMonkey

@lucapericlp Hey!

metavoice is amazing! :) Unfortunately the colab does not work for me. I always get this error:

TorchRuntimeError: Failed running call_function <built-in function scaled_dot_product_attention>(*(FakeTensor(..., device='cuda:0', size=(2, 16, s0, 128)), FakeTensor(..., device='cuda:0', size=(2, 16, 2048, 128), dtype=torch.float16), FakeTensor(..., device='cuda:0', size=(2, 16, 2048, 128), dtype=torch.float16)), **{'attn_mask': FakeTensor(..., device='cuda:0', size=(1, 1, s0, 2048), dtype=torch.bool), 'dropout_p': 0.0}):
Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: c10::Half and value.dtype: c10::Half instead.

from user code:
   File "/content/metavoice-src/fam/llm/fast_inference_utils.py", line 131, in prefill
    logits = model(x, spk_emb, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 160, in forward
    x = layer(x, input_pos, mask)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 179, in forward
    h = x + self.attention(self.attention_norm(x), mask, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 222, in forward
    y = F.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0)

Any advice would be highly appreciated!

sebastianrueckerai avatar Mar 27 '24 23:03 sebastianrueckerai
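
The error above says that `scaled_dot_product_attention` received a float32 query together with float16 key and value tensors. The sketch below reproduces that kind of mismatch in isolation on small CPU tensors and shows one generic way to avoid it, by casting all three tensors to a common dtype before the call. This is an illustration only: it is not the repo's code and not necessarily the temporary fix referenced in the tracking issue, and the helper names `align_qkv_dtype` and `sdpa_rejects` are hypothetical.

```python
import torch
import torch.nn.functional as F


def align_qkv_dtype(q, k, v, dtype=None):
    """Cast q, k and v to one dtype (defaults to the key's dtype)."""
    target = dtype if dtype is not None else k.dtype
    return q.to(target), k.to(target), v.to(target)


def sdpa_rejects(q, k, v):
    """Return True if scaled_dot_product_attention raises a RuntimeError."""
    try:
        F.scaled_dot_product_attention(q, k, v, dropout_p=0.0)
    except RuntimeError:
        return True
    return False


# Small CPU tensors shaped (batch, heads, seq_len, head_dim)
q = torch.randn(2, 4, 8, 16, dtype=torch.float32)
k = torch.randn(2, 4, 8, 16).to(torch.float16)
v = torch.randn(2, 4, 8, 16).to(torch.float16)

# Mixed float32 / float16 inputs are rejected by the dtype check
mismatch_rejected = sdpa_rejects(q, k, v)

# Upcast to float32 so the demo runs on CPU; on a GPU one would more
# likely cast the query down to float16 to match the cached key/value
q2, k2, v2 = align_qkv_dtype(q, k, v, dtype=torch.float32)
out = F.scaled_dot_product_attention(q2, k2, v2, dropout_p=0.0)
```

Whether casting the query down or the key/value up is appropriate depends on where the mismatch originates in the model, so treat this as a way to understand the error rather than a patch.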

Hey @sebastianrueckerai, thanks for the enthusiasm! Sorry that you're bumping into this issue; we're tracking it here: https://github.com/metavoiceio/metavoice-src/issues/108. There's a temporary fix in the issue linked there, which you can try out in the meantime.

lucapericlp avatar Mar 28 '24 07:03 lucapericlp