Floating point exception (core dumped)

Open zoushipeng opened this issue 4 months ago • 5 comments

When I run this command: python demo/inference_from_file.py --model_path microsoft/VibeVoice-1.5B --txt_path demo/text_examples/1p_abs.txt --speaker_names Alice

I get an error (see the title). It occurs in VibeVoiceForConditionalGenerationInference.forward, at this line: logits = self.lm_head(hidden_states[:, slice_indices, :])

When I print hidden_states[:, slice_indices, :], I get: tensor([[[ 0.4473, -1.0312, 0.4863, ..., 0.8867, 0.9414, -2.4688]]], device='cuda:0', dtype=torch.bfloat16)

So what's wrong?

zoushipeng avatar Sep 01 '25 10:09 zoushipeng

Env? Error message?

YaoyaoChang avatar Sep 01 '25 10:09 YaoyaoChang

Looks like https://github.com/microsoft/VibeVoice/pull/72 fixed your problem — try it out and share how it goes!

YaoyaoChang avatar Sep 01 '25 11:09 YaoyaoChang

Thanks very much. I will try it later.

zoushipeng avatar Sep 01 '25 11:09 zoushipeng

Quick tester note (for @zoushipeng and anyone who can repro)

Context. Issue #71 hits a “Floating point exception (core dumped)” on CUDA with torch.bfloat16, crashing at the lm_head call on sliced hidden states.

PR #72 adds two minimal guards:

  1. Validate slice_indices before slicing (so invalid/empty selections raise a clear Python error instead of a CUDA kernel FPE).
  2. If CUDA + bf16 on a GPU without native bf16 (SM < 80), upcast to float32 only at the lm_head boundary.
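For readers who want to see the shape of these two guards without opening the PR, here is a minimal sketch in plain Python. It is not the PR's code: the function names validate_slice_indices and needs_fp32_upcast are hypothetical, and the device type, dtype name, and compute capability are passed in as plain values so the logic runs without a GPU.

```python
from typing import Sequence, Tuple


def validate_slice_indices(slice_indices: Sequence[int], seq_len: int) -> None:
    """Hypothetical guard 1: fail early with a Python error, not a kernel crash."""
    if len(slice_indices) == 0:
        raise ValueError("slice_indices is empty; nothing to pass to lm_head")
    for idx in slice_indices:
        # Negative indices are allowed, as in normal Python/PyTorch indexing.
        if not -seq_len <= idx < seq_len:
            raise IndexError(
                f"slice index {idx} is out of range for sequence length {seq_len}"
            )


def needs_fp32_upcast(
    device_type: str, dtype_name: str, compute_capability: Tuple[int, int]
) -> bool:
    """Hypothetical guard 2: True only for CUDA + bfloat16 on a GPU below SM 80."""
    major, minor = compute_capability
    sm = major * 10 + minor  # e.g. (7, 5) -> 75, (8, 0) -> 80
    return device_type == "cuda" and dtype_name == "bfloat16" and sm < 80
```

In a real PyTorch setup, the capability tuple would typically come from torch.cuda.get_device_capability(), and a True result would mean casting the sliced hidden states (and running lm_head) in float32 for that one call. How PR #72 actually wires this in may differ.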

Try the fix

git fetch https://github.com/vatsalm1611/VibeVoice.git fix/cuda-bf16-fpe-guard:tmp-fpe-guard
git checkout tmp-fpe-guard

python demo/inference_from_file.py \
  --model_path microsoft/VibeVoice-1.5B \
  --txt_path demo/text_examples/1p_abs.txt \
  --speaker_names Alice

vatsalm1611 avatar Sep 01 '25 12:09 vatsalm1611

It did not work; see https://github.com/microsoft/VibeVoice/pull/72#issuecomment-3242375852

zoushipeng avatar Sep 01 '25 13:09 zoushipeng