Steve Lu

4 comments by Steve Lu

Hi, the two issues may both be due to the implementation of tf.CuDNNLSTM. The reported performance is based on my manually implemented LSTM (as implemented in generator.py, the older...

The sorting operation is hidden behind the implementation of the Python "map" object. As it is implemented by a binary search tree, which automatically sorts the inserted elements, the...

Yes, but they are of little help with this problem. The problem is that WGAN-GP has a term in its loss function which is the norm of the original model's...
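For context, the term in question in the standard WGAN-GP formulation is the gradient penalty, λ(‖∇D(x̂)‖₂ − 1)², where x̂ is a random interpolate between a real and a generated sample. Because the loss itself contains a gradient norm, training has to differentiate through that gradient. Below is a rough pure-Python sketch of the penalty only. The toy critic `D(x) = tanh(w · x)`, the helper names, and the closed-form gradient are assumptions made for illustration, not the model discussed in this thread; `lam=10.0` follows the coefficient commonly used with WGAN-GP.

```python
import math
import random


def critic_grad(x, w):
    """Input gradient of a hypothetical toy critic D(x) = tanh(w . x).

    d/dx tanh(w . x) = (1 - tanh(w . x)^2) * w
    """
    s = math.tanh(sum(wi * xi for wi, xi in zip(w, x)))
    return [(1.0 - s * s) * wi for wi in w]


def gradient_penalty(real, fake, w, lam=10.0, rng=random):
    """WGAN-GP style penalty: lam * (||grad_x D(x_hat)||_2 - 1)^2."""
    # Sample a point on the straight line between the real and fake samples.
    eps = rng.random()
    x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]

    # Norm of the critic's gradient with respect to its input at x_hat.
    grad = critic_grad(x_hat, w)
    norm = math.sqrt(sum(g * g for g in grad))

    # Penalise deviation of the gradient norm from 1.
    return lam * (norm - 1.0) ** 2
```

In a real framework the gradient would come from automatic differentiation rather than a hand-written formula, and the penalty would then be differentiated again with respect to the critic's parameters.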

I've encountered this bug too. After inspection, it seems to me that the following implementation in the current stable release is related:

```python
# In deepspeed/runtime/zero/stage_1_and_2.py
def complete_grad_norm_calculation_for_cpu_offload(self, params):
    total_norm = 0.0...
```