Heecheol Cho comments

Results: 24 comments by Heecheol Cho

Why is the 1st frame of mel spectrum set to 0?

For the causal cut, the last frame is dropped. To keep the length unchanged, a zero frame is padded in front as the 1st frame.
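A minimal NumPy sketch of that shift, assuming the mel spectrogram is stored as a `(frames, n_mels)` array. The function name `shift_right` and the array layout are hypothetical, chosen for illustration and not taken from the repository:

```python
import numpy as np

def shift_right(mel):
    """Drop the last frame and prepend an all-zero frame.

    mel: array of shape (frames, n_mels). The output has the same shape,
    so frame t of the output is frame t-1 of the input.
    """
    zero_frame = np.zeros_like(mel[:1])          # shape (1, n_mels)
    return np.concatenate([zero_frame, mel[:-1]], axis=0)

# Toy input: 4 frames, 3 mel bins (hypothetical sizes).
mel = np.arange(12, dtype=np.float32).reshape(4, 3)
shifted = shift_right(mel)
```

With this shift, the input at step t only contains information from frames before t, which is why the first frame ends up as zeros.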

[suggestion] a simple LengthRegulator

@KinamSalad I wasn't thinking of a jit decorator. I only had a simple implementation in mind.
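For readers unfamiliar with the term, here is a rough sketch of what a simple length regulator does, assuming the FastSpeech-style meaning: each token's hidden vector is repeated according to its predicted duration. The name `length_regulate` and the NumPy formulation are assumptions for illustration, not the code proposed in the issue:

```python
import numpy as np

def length_regulate(hidden, durations):
    """Repeat each row of `hidden` durations[i] times along the time axis.

    hidden:    (tokens, dim) array of per-token features
    durations: (tokens,) array of non-negative integers (frames per token)
    returns:   (sum(durations), dim) array
    """
    return np.repeat(hidden, durations, axis=0)

# Toy example with hypothetical values: 3 tokens, 2 feature dimensions.
hidden = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
durations = np.array([2, 0, 3])
expanded = length_regulate(hidden, durations)
```

A token with duration 0 is simply skipped, and the output length equals the sum of the durations.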

tf.layers.conv1d(dilations)

I simplified the code using tf.layers.conv1d. See the [Issue](https://github.com/ibab/tensorflow-wavenet/issues/370)
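To illustrate what the dilation does, here is a plain NumPy sketch of a causal dilated 1-D convolution on a single channel. It is not TensorFlow's implementation and not the code from the linked issue; the function name and the single-channel simplification are assumptions:

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """Single-channel causal dilated convolution.

    x: (T,) input signal, w: (K,) filter taps.
    y[t] = sum_k w[k] * x[t - (K - 1 - k) * dilation], with x[t] = 0 for t < 0.
    """
    K = len(w)
    pad = (K - 1) * dilation                      # left padding keeps it causal
    xp = np.concatenate([np.zeros(pad, dtype=x.dtype), x])
    y = np.zeros(len(x), dtype=np.result_type(x, w))
    for k in range(K):
        start = k * dilation
        y += w[k] * xp[start:start + len(x)]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 1.0])
y = causal_dilated_conv1d(x, w, dilation=2)       # y[t] = x[t-2] + x[t]
```

With a 2-tap filter and dilation 2, each output sample combines the current input with the one two steps earlier, so stacking layers with growing dilations widens the receptive field quickly.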

A great improvement has been made for master branch (LJSpeech)

@begeekmyfriend Is begeekmyfriend's model different from the original model? keithito's model has 3 GRUs in the decoder, whereas begeekmyfriend's model has 2. Is that right? ![begeekmyfriend](https://user-images.githubusercontent.com/26861167/47276704-fe004580-d5f3-11e8-8118-8612e32e8150.png)

Neural Acoustic Models - ratsgo's speechbook

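I have a question about the part labeled as the third frame (t=3) in Figure 6. If 'g' was predicted at the t=2 frame, shouldn't the t=2 frame be used once more without advancing the encoder time step (the case in the figure where the red arrow points upward)?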

MFCCs - ratsgo's speechbook

Figure 4 appears twice. One of them needs to be changed to Figure 5.

MFCCs - ratsgo's speechbook

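I think the reason for discarding the first column vector of the MFCC result (the sum of the log mel spectrogram) can be understood from the figure below. (In the figure the horizontal and vertical axes are swapped, so read it as a row vector.) ![MFCC2](https://user-images.githubusercontent.com/26861167/95292787-23ec8680-08ad-11eb-8b5d-e75c0fa31528.png) On the left, the values in the bottom line are so small (negative) that the other...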
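The statement that the first coefficient is the sum of the log mel spectrogram can be checked with a small NumPy sketch. It assumes an unnormalized DCT-II over the mel axis. Library implementations may scale the coefficients differently, and the input values here are random placeholders, not real speech features:

```python
import numpy as np

def dct2_unnormalized(x):
    """Unnormalized DCT-II: c[k] = sum_n x[n] * cos(pi * k * (n + 0.5) / N)."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    basis = np.cos(np.pi * k * (n + 0.5) / N)     # shape (N, N), row k is basis k
    return basis @ x

rng = np.random.default_rng(0)
log_mel = np.log(rng.uniform(1e-3, 1.0, size=40)) # one frame, 40 mel bins (mostly negative)
coeffs = dct2_unnormalized(log_mel)

first = coeffs[0]                                 # equals log_mel.sum() under this scaling
rest = coeffs[1:]                                 # what remains after dropping coefficient 0
```

Because the k=0 basis vector is all ones, the first coefficient only reflects the overall level of the frame, so its magnitude can be much larger than that of the remaining coefficients.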

Introduction - ratsgo's speechbook

Question about Transducer -> recognize -> decode

Inquiry about a Wavenet-Vocoder training speed problem