FastDiff
Question about noise scheduling process.
Hello, I'm trying to implement the noise scheduling process, referring to BDDM's implementation in BDDM/sampler.py.
I have some questions about the noise scheduling process for FastDiff-TTS.
- In the FastDiff paper, alphaN and betaN are set as hyperparameters, e.g. $\hat{\alpha}_t = 0.54$, $\hat{\beta}_t = 0.70$. Can I use these values for my own FastDiff-TTS model, or with a different number of reverse steps (e.g. 6, 8, 10)? How are they calculated?
- For BDDM, searching for alphaN and betaN requires a greedy search with search_bin=9, followed by a further 10 search steps that add noise to the parameters, e.g. `_alpha_param = alpha_param * (0.95 + np.random.rand() * 0.1)`. Does FastDiff require a similar process?
- For BDDM, STOI and PESQ are estimated on the generated audio to find the best noise schedule. How should the best parameters be selected based on the two indicators, STOI and PESQ?
- Are STOI and PESQ also needed in the parameter search for FastDiff?
- In BDDM, `num_reverse_steps = math.floor(T / tau)`. But in FastDiff, T=1000, tau=200 and num_reverse_steps=4. Do I need to calculate num_reverse_steps by `math.floor(T/tau) - 1`?
Thank you.
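For reference, the perturbation quoted above multiplies the parameter by a random factor drawn uniformly from roughly [0.95, 1.05). A minimal sketch, assuming only that quoted line: the helper name `perturb`, the fixed seed, the example value 0.54, and the use of Python's `random` module in place of `np.random.rand()` are illustrative choices, not BDDM's code.

```python
import random


def perturb(param: float, rng: random.Random) -> float:
    """Scale param by a random factor, same form as the quoted line."""
    # rng.random() is uniform on [0, 1), so the factor lies in about [0.95, 1.05).
    return param * (0.95 + rng.random() * 0.1)


rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
alpha_param = 0.54      # example value borrowed from the first question
candidates = [perturb(alpha_param, rng) for _ in range(10)]

# Every candidate stays within 5% of the starting value.
print(min(candidates), max(candidates))
```

Each candidate would then be scored (in BDDM's case with STOI and PESQ on generated audio) to pick the schedule to keep; that scoring step is not shown here.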
Hi,
- It's OK to use another number of reverse steps; just set the maximum number of sampling steps in scheduling ("N") in BDDM.
- The noise predictor of FastDiff shares a similar mechanism with BDDM's, so the calculation of STOI and PESQ is required.
- Thanks, this $\tau$ is a typo, and the algorithm still remains `math.floor(T/tau)`. You could try it yourself: the higher $\tau$ is, the shorter the predicted inference schedule tends to be.
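The step-count formula confirmed in the reply can be checked with a few values. This is a minimal sketch: the function name `max_reverse_steps` is hypothetical and does not come from the BDDM or FastDiff code; it only evaluates `math.floor(T / tau)` as quoted in this thread.

```python
import math


def max_reverse_steps(T: int, tau: int) -> int:
    """Evaluate math.floor(T / tau), the formula quoted in this thread."""
    if T <= 0 or tau <= 0:
        raise ValueError("T and tau must be positive")
    return math.floor(T / tau)


# Larger tau gives fewer steps for the same T.
for tau in (100, 200, 250, 500):
    print(f"T=1000, tau={tau}: {max_reverse_steps(1000, tau)} steps")
```

With T=1000, tau=200 gives 5 steps rather than 4, which is consistent with the reply that the printed $\tau$ is a typo and no `- 1` correction is needed. Under this formula, 4 steps corresponds to any tau greater than 200 and at most 250.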