VibeVoice
Open-Source Frontier Voice AI
**Description:** Currently, audio generated with VibeVoice tends to be continuous, which can sound unnatural, especially for longer content like lectures or podcasts. It would be extremely helpful to have the...
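The pauses requested above could also be approximated in post-processing, by inserting silence between separately generated segments. A minimal sketch, assuming 1-D float audio arrays; the 24 kHz rate and 0.6 s pause are illustrative defaults, not VibeVoice settings:

```python
import numpy as np

def join_with_pauses(segments, pause_s=0.6, sample_rate=24000):
    """Concatenate audio segments, inserting silence between consecutive ones.

    `segments` is a list of 1-D float arrays. Both default values are
    assumptions for illustration, not values taken from VibeVoice.
    """
    silence = np.zeros(int(pause_s * sample_rate), dtype=np.float32)
    parts = []
    for i, seg in enumerate(segments):
        if i > 0:
            parts.append(silence)  # pause only between segments, not at the edges
        parts.append(np.asarray(seg, dtype=np.float32))
    return np.concatenate(parts) if parts else silence[:0]
```

For example, joining two segments with a 0.5 s pause at 100 Hz yields one gap of 50 samples between them.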
I am having this issue but cannot work out exactly how to fix it using this suggestion from the GitHub page: 'If you found the generated voice...
Would there be interest in modifying the demo code `demo/inference_from_file.py` to support [Intel XPUs](https://docs.pytorch.org/docs/stable/notes/get_start_xpu.html)? If so, I have it working [here](https://git.ayo.run/ayo/VibeVoice/commit/1cb2a50ce5954d5871e2556f6a97a2be81cdcf9c) and would be happy to open a PR. Thanks!...
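A patch like the one linked above typically boils down to extending the device-selection fallback chain. A minimal sketch of that logic; `pick_device` is a hypothetical helper, and the boolean flags stand in for calls such as `torch.cuda.is_available()` and `torch.xpu.is_available()`:

```python
def pick_device(cuda_ok: bool, xpu_ok: bool, mps_ok: bool) -> str:
    """Return a torch device string, preferring CUDA, then Intel XPU, then MPS.

    Illustrative only: in real code the flags would come from
    torch.cuda.is_available(), torch.xpu.is_available(), and
    torch.backends.mps.is_available().
    """
    if cuda_ok:
        return "cuda"
    if xpu_ok:
        return "xpu"
    if mps_ok:
        return "mps"
    return "cpu"
```

The model and inputs would then be moved with `.to(device)` as usual; XPU support requires a PyTorch build with Intel extension support, per the linked get-started notes.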
`python demo/inference_from_file.py --model_path microsoft/VibeVoice-Large --txt_path demo/text_examples/2p_music.txt --speaker_names Alice Frank`
I really want to use it as my TTS server.
When I run `python demo/inference_from_file.py --model_path microsoft/VibeVoice-1.5B --txt_path demo/text_examples/1p_abs.txt --speaker_names Alice`, I get an error. I find it occurs in `VibeVoiceForConditionalGenerationInference.forward`, at `logits = self.lm_head(hidden_states[:, slice_indices, :])`. I print "hidden_states[:, slice_indices,...
I'm running VibeVoice on an AMD GPU with torch-directml as the backend, and I hit a runtime error during inference:

```
\vibevoice\modular\modular_vibevoice_tokenizer.py", line 495, in _forward_streaming
    full_input = torch.cat([cached_input, x],...
```
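Errors at this `torch.cat` usually mean the cached tensor and the incoming chunk disagree in device or dtype, which some backends such as torch-directml are stricter about. A minimal sketch of the streaming-concatenation pattern involved, using plain lists as a stand-in for tensors; `cat_streaming` and its `window` parameter are hypothetical, not VibeVoice's API:

```python
def cat_streaming(cached_input, x, window=4):
    """Stand-in for the _forward_streaming concat: join the cached tail
    with the new chunk, then keep a fixed-size tail as the next cache.

    With real tensors the two operands must first be aligned, e.g.
    cached_input = cached_input.to(x.device, x.dtype), before torch.cat.
    """
    full_input = list(cached_input) + list(x)   # torch.cat([cached_input, x], dim=-1)
    new_cache = full_input[-window:]            # illustrative fixed-size cache
    return full_input, new_cache
```

This only illustrates the data flow; the actual fix on a directml setup would be to move or cast the cache to match `x` before the concatenation.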
Here you can see how the button goes from "generate" to "stop", then flashes "generate" long enough to make you want to click it, but then replaces it with "random...
Hello, do you have a Chinese voice-tone model?