chenjiasheng
https://github.com/zzw922cn/Automatic_Speech_Recognition/blob/545a1981dbc705d6f8312650a9d5a290ee065f8a/models/deepSpeech2.py#L73

> As in (Laurent et al., 2015), there are two ways of applying BatchNorm to the recurrent operation. A natural extension is to insert a BatchNorm transformation,...
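For reference, a minimal sketch (in PyTorch for brevity; class and variable names are mine, not the repo's) of the sequence-wise variant that the Deep Speech 2 passage goes on to describe: BatchNorm is applied only to the input-to-hidden term `W x_t`, leaving the recurrent term `U h_{t-1}` untouched.

```python
import torch
import torch.nn as nn

class SeqWiseBNRNNCell(nn.Module):
    """Sketch of sequence-wise batch normalization for a vanilla RNN cell:
    h_t = tanh(BN(W x_t) + U h_{t-1}). Only the vertical (input-to-hidden)
    connection is normalized; the recurrent connection is left as-is."""
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.wx = nn.Linear(input_size, hidden_size, bias=False)
        self.uh = nn.Linear(hidden_size, hidden_size, bias=True)
        self.bn = nn.BatchNorm1d(hidden_size)  # statistics over the minibatch

    def forward(self, x_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # x_t: (batch, input_size), h_prev: (batch, hidden_size)
        return torch.tanh(self.bn(self.wx(x_t)) + self.uh(h_prev))

# usage sketch: one time step for a batch of 8 utterances with 40-dim features
cell = SeqWiseBNRNNCell(input_size=40, hidden_size=64)
h_next = cell(torch.randn(8, 40), torch.zeros(8, 64))
```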
https://github.com/zzw922cn/Automatic_Speech_Recognition/blob/545a1981dbc705d6f8312650a9d5a290ee065f8a/models/deepSpeech2.py#L68
https://github.com/zzw922cn/Automatic_Speech_Recognition/blob/545a1981dbc705d6f8312650a9d5a290ee065f8a/models/deepSpeech2.py#L56

See https://github.com/PaddlePaddle/models/blob/develop/deep_speech_2/layer.py, `conv_group`.
I found that training our model with a Tesla P100 GPU is not any faster than training it with an ordinary CPU. According to these console outputs, it seems TensorFlow...
Have you tried Warp-CTC on GPUs? Do you have any clue about getting CTC to work on multiple GPUs?
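One quick way to check whether the slowdown simply means ops are falling back to the CPU is TensorFlow 1.x's device-placement logging; a minimal sketch, not taken from the repo:

```python
import tensorflow as tf  # TF 1.x, which this repo appears to target

# Sketch: log where each op is placed. If everything lands on /cpu:0, the
# GPU build / CUDA setup is the likely culprit rather than the model code.
config = tf.ConfigProto(log_device_placement=True)
with tf.Session(config=config) as sess:
    a = tf.random_normal([1024, 1024])
    b = tf.random_normal([1024, 1024])
    sess.run(tf.matmul(a, b))  # the console output lists the device of each op
```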
…from becoming a performance bottleneck during distributed training.

# Before submitting

- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you...
## 🐛 Bug

Function `sample_negatives` is the performance bottleneck of multi-GPU distributed training. This function randomly generates negative-sample indexes. Several tensors, such as `tszs` and `neg_idxs`, are created in...
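For illustration, a hypothetical sketch (function name, shapes, and arguments are assumptions, not fairseq's actual code) of the general pattern behind such negative sampling, creating the index tensors directly on the target device so the host CPU does not become the bottleneck under multi-GPU training:

```python
import torch

def sample_negative_indices(batch_size: int, seq_len: int, num_negatives: int,
                            device: torch.device) -> torch.Tensor:
    """For every time step, draw `num_negatives` random indices pointing at
    other time steps of the same utterance. Allocating the tensors on `device`
    avoids per-batch CPU work and host-to-device copies."""
    # (batch, seq_len * num_negatives) random positions in [0, seq_len - 2]
    neg_idxs = torch.randint(0, seq_len - 1,
                             (batch_size, seq_len * num_negatives), device=device)
    # shift indices that collide with the "positive" time step they belong to
    targets = torch.arange(seq_len, device=device).repeat_interleave(num_negatives).unsqueeze(0)
    neg_idxs[neg_idxs >= targets] += 1
    return neg_idxs

# usage sketch
idxs = sample_negative_indices(batch_size=2, seq_len=10, num_negatives=4,
                               device=torch.device("cpu"))
print(idxs.shape)  # torch.Size([2, 40])
```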
### Describe the bug

The "reflect" padding mode tries to copy half the kernel size's worth of frames from the input. It fails when the `input`'s `time_length` is smaller than half...
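A minimal PyTorch reproduction sketch of the underlying constraint (shapes here are assumptions): reflect padding requires the padded dimension to be longer than the pad amount, so an utterance shorter than half the kernel size raises an error.

```python
import torch
import torch.nn.functional as F

kernel_size = 11
pad = kernel_size // 2            # 5 frames copied from each side

x_ok = torch.randn(1, 1, 8)       # time_length = 8 > pad -> works
x_bad = torch.randn(1, 1, 3)      # time_length = 3 < pad -> fails

F.pad(x_ok, (pad, pad), mode="reflect")       # fine
try:
    F.pad(x_bad, (pad, pad), mode="reflect")
except RuntimeError as e:
    print(e)  # padding size should be less than the corresponding input dimension
```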
https://github.com/microsoft/LoRA/blob/7036ee01b45dd4d4eb708941ffe1b5b414a013d5/loralib/layers.py#L59C40-L59C40

```
if hasattr(self, 'lora_A'):
    # initialize A the same way as the default for nn.Linear and B to zero
    nn.init.zeros_(self.lora_A)
    nn.init.normal_(self.lora_B)
```

It seems there is a misuse of...
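For comparison, a sketch of the initialization the in-code comment appears to intend, following the LoRA paper (A gets a random Gaussian init, B starts at zero, so the low-rank update B @ A is zero at the start of training); the helper name and shapes below are mine, not loralib's:

```python
import torch
import torch.nn as nn

def reset_lora_parameters(lora_A: nn.Parameter, lora_B: nn.Parameter) -> None:
    """Initialize A randomly and B to zero, so the update B @ A starts at zero
    and the pretrained weights are initially unchanged."""
    nn.init.normal_(lora_A)  # A: random init
    nn.init.zeros_(lora_B)   # B: zeros

# hypothetical usage: rank-4 adapter for a 16 -> 16 linear layer
lora_A = nn.Parameter(torch.empty(4, 16))
lora_B = nn.Parameter(torch.empty(16, 4))
reset_lora_parameters(lora_A, lora_B)
assert torch.all(lora_B @ lora_A == 0)
```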