More details of the "baseline acoustic codec"

Open hbwu-ntu opened this issue 1 year ago • 1 comment

Hi, thank you for the amazing work. May I ask two questions about Table 1?

1. Could you please describe the baseline acoustic codec in your paper in more detail? Does it

   - exclude $S$ on the encoder side but still try to predict $\hat{S}$, or
   - remove the entire blue block in Figure 1?

2. For Encodec and DAC, did you use their released checkpoints, or did you train counterparts on the same dataset (LibriSpeech)?

Sep 02 '24 12:09 hbwu-ntu

Thank you for your interest and your questions!

1. Yes, the baseline acoustic codec excludes the entire blue block in Figure 1.
2. We used the released checkpoints for both Encodec and DAC.

Looking forward to further discussions with you!

Sep 02 '24 12:09 zhenye234