
[ISSUE] flash_attn f16 warning

zhzLuke96 opened this issue 7 months ago

Your issue

With Flash Attention enabled, I get this warning:

> Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in LlamaModel is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `torch_dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)`

Also, when `--compile` is enabled at the same time, the app fails to start.

When I enable `--compile` on its own, what does "trigger shape warm-up precompilation yourself" mean? Does it just mean the first speech generation is slow? Even once it is running, it doesn't feel noticeably faster.
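For reference, the two fixes the warning itself proposes look roughly like this in plain `transformers` code. This is only a sketch: the model name is the placeholder from the warning text, not the model ChatTTS-Forge actually loads internally.

```python
import torch
from transformers import AutoModel

# Remedy 1 (from the warning): load the weights directly in half precision so
# the dtype matches what Flash Attention 2 supports.
# "openai/whisper-tiny" is just the placeholder model from the warning text.
model = AutoModel.from_pretrained(
    "openai/whisper-tiny",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.float16,
)

# Remedy 2 (from the warning): keep fp32 weights but run inference under
# automatic mixed precision.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    ...  # run generation here
```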

When calling the API via curl with streaming enabled, how do I receive the generated mp3 file as a stream?
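One way to consume a streaming audio response is to read the HTTP body chunk by chunk as it arrives. The sketch below is illustrative only: the URL, endpoint path, and payload fields are placeholders, not the actual ChatTTS-Forge API schema, so check the Forge API docs/playground for the real names.

```python
import requests

# Placeholder endpoint and parameters -- adjust to the real Forge API.
url = "http://localhost:7870/v1/tts"
payload = {"text": "你好", "stream": True, "format": "mp3"}

with requests.post(url, json=payload, stream=True) as resp:
    resp.raise_for_status()
    with open("out.mp3", "wb") as f:
        # iter_content yields mp3 bytes as they arrive instead of waiting
        # for the whole response body, which is the point of streaming.
        for chunk in resp.iter_content(chunk_size=4096):
            if chunk:
                f.write(chunk)
```

With plain curl, the rough equivalent is to disable output buffering and write the body to a file as it arrives (`--no-buffer` together with `--output out.mp3`).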

Originally posted by @caixianyu in https://github.com/lenML/ChatTTS-Forge/issues/96#issuecomment-2217691408

- The `flash_attn` warning is a bit odd; in principle half precision should be enabled by default. The upstream logic for this was only recently updated and I ported it over just a few days ago, so there may still be problems that need investigating.
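A quick way to confirm whether half precision actually took effect is to check the dtype of the loaded weights. This is purely an illustrative sketch; the helper name and logic below are not part of the Forge codebase.

```python
import torch
from torch import nn


def ensure_half_precision(model: nn.Module) -> nn.Module:
    """Illustrative helper (not actual Forge code).

    If the loaded LlamaModel still reports torch.float32, Flash Attention 2
    emits the warning above; casting the weights to fp16/bf16 should
    silence it.
    """
    dtype = next(model.parameters()).dtype
    if dtype == torch.float32:
        model = model.half()  # or model.to(torch.bfloat16)
    return model
```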

Originally posted by @zhzLuke96 in https://github.com/lenML/ChatTTS-Forge/issues/96#issuecomment-2219694246

zhzLuke96 · Jul 10 '24 06:07