zachx121
@helloworld53 hello, could you show me how to get the expressive checkpoint? I tried this [form](https://ai.meta.com/resources/models-and-libraries/seamless-downloads/) from Meta, but after clicking the 'Accept and continue' button nothing happened, seems...
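@ahaha721 hi, my understanding of the processing pipeline is as follows: 1. "Phonemes, tone, intonation": `get_phones_and_bert` and `t2s_model.model.infer_panel` produce `pred_semantic`; 2. "Final audio": `vq_model.decode` synthesizes the audio corresponding to `pred_semantic`. What I would like to try: 1. Could the second step, `vq_model.decode`, decode a little and return a little at a time, so that it becomes streaming? 2. As far as I can tell, the BERT-based tone prediction in the earlier steps has to run inference over the whole text at once, so the first step probably cannot be made streaming?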
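
To make question 1 more concrete, here is a rough sketch of the chunked decoding I have in mind. It is only an illustration under assumptions: `stream_decode`, `fake_decode`, and `chunk_size` are hypothetical names, and `fake_decode` is a stand-in rather than the real `vq_model.decode`, whose arguments and behaviour on partial input I have not verified.

```python
from typing import Callable, Iterator, List, Sequence


def stream_decode(
    tokens: Sequence[int],
    decode_fn: Callable[[Sequence[int]], List[float]],
    chunk_size: int = 50,
) -> Iterator[List[float]]:
    """Yield audio chunk by chunk instead of decoding all tokens at once.

    decode_fn is a placeholder for whatever turns semantic tokens into
    audio samples; it is an assumption, not the project's actual API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(tokens), chunk_size):
        # Decode only the current slice and hand it back immediately.
        yield decode_fn(tokens[start:start + chunk_size])


def fake_decode(chunk: Sequence[int]) -> List[float]:
    # Hypothetical stand-in decoder: emits 2 samples per token.
    return [float(t) for t in chunk for _ in range(2)]


if __name__ == "__main__":
    semantic = list(range(7))  # pretend this is pred_semantic
    for i, audio in enumerate(stream_decode(semantic, fake_decode, chunk_size=3)):
        print(i, len(audio))
```

With a real decoder, independent chunks may not sound the same as one full decode, since each chunk loses context at its boundaries, so some overlap or crossfading between chunks might be needed.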