descript-audio-codec (Paper Error?) MSD Not Used?

Open zaptrem opened this issue 1 year ago • 3 comments

Your paper says "Like prior work, we use multi-scale (MSD) and multi-period waveform discriminators (MPD) which lead to improved audio fidelity."

However, by default it appears the trainer has an empty rates array, which means no MSDs are initialized. Was this intentional?

Nov 27 '23 20:11 zaptrem

@ritheshkumar95 @eeishaan could you check this issue? It looks like the published configs have an empty rates list, so no MSD discriminator is initialized.

References: https://github.com/descriptinc/descript-audio-codec/blob/main/conf/final/44khz.yml#L16 https://github.com/descriptinc/descript-audio-codec/blob/main/dac/model/discriminator.py#L203
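To illustrate the point, here is a minimal sketch of how an empty rates list would lead to zero MSDs being built. It assumes the discriminator stack is assembled with one sub-discriminator per entry in each list, which is how the linked lines read. The function name `build_discriminators` and the period values are hypothetical, chosen only for illustration; this is not the actual DAC code.

```python
from typing import List, Sequence


def build_discriminators(rates: Sequence[int], periods: Sequence[int]) -> List[str]:
    """Return a label for each sub-discriminator that would be created.

    Hypothetical stand-in for the real constructor: one MPD per period
    and one MSD per rate. Strings are used instead of real modules so
    the sketch runs without any dependencies.
    """
    discs: List[str] = []
    discs += [f"MPD(period={p})" for p in periods]
    # With rates == [], this comprehension yields nothing, so no MSD is added.
    discs += [f"MSD(rate={r})" for r in rates]
    return discs


def count_msd(discs: Sequence[str]) -> int:
    """Count how many entries are multi-scale discriminators."""
    return sum(1 for d in discs if d.startswith("MSD"))


# Illustrative values only, not taken from the published configs.
example_periods = [2, 3, 5]

with_empty_rates = build_discriminators(rates=[], periods=example_periods)
with_two_rates = build_discriminators(rates=[1, 2], periods=example_periods)

print(with_empty_rates, count_msd(with_empty_rates))
print(with_two_rates, count_msd(with_two_rates))
```

Under that assumption, the MPDs are still created when rates is empty, so training runs normally and the missing MSD is easy to overlook.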

Dec 03 '23 15:12 Oktai15

@pseeth

Dec 03 '23 15:12 Oktai15

Hey, sorry - was on paternity leave. I've also somehow lost access to @pseeth, so continuing on my new account.

Hmm, yes, that's correct - MSD is not used in the final version of DAC. We should probably have said that we investigated the use of MSD and MPD. However, if you look at the ablation section of the paper, in Section 4.5 under "Discriminator", we mention that the use of the MSD, or a single-scale discriminator, did not help. We retain MPD only. In the table, you can see a similar result - the top row (MPD, and no scale discriminator) outperforms the last row of the discriminator section.

However, good point that we don't have an ablation with both MPD and MSD on. I do remember trying it though and not seeing any significant difference, with the added drawback of slower training. Hope this helps!

Dec 16 '23 22:12 seethlord
