Tianjian Meng

3 comments by Tianjian Meng

You can try manually setting the grad of the fixed norms to None; that should solve this problem.
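A minimal sketch of that suggestion, assuming "fixed norms" means normalization layers whose parameters have been frozen with requires_grad=False. The helper name clear_frozen_norm_grads and the toy model are hypothetical and only illustrate the idea; they are not taken from the original code base.

```python
import torch
import torch.nn as nn


def clear_frozen_norm_grads(model: nn.Module) -> int:
    """Set .grad to None on frozen parameters of normalization layers.

    Returns the number of parameters whose grad was cleared.
    """
    # Assumption: these layer types cover the "norms" in question.
    norm_types = (nn.modules.batchnorm._BatchNorm, nn.GroupNorm, nn.LayerNorm)
    cleared = 0
    for module in model.modules():
        if isinstance(module, norm_types):
            for p in module.parameters(recurse=False):
                if not p.requires_grad:
                    p.grad = None  # stale gradient is dropped entirely
                    cleared += 1
    return cleared


# Toy example: a conv layer followed by a batch norm layer.
model = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3), nn.BatchNorm2d(4))
model(torch.randn(2, 3, 8, 8)).sum().backward()  # populates .grad everywhere

# Freeze the norm layer after gradients already exist, then clear them.
for p in model[1].parameters():
    p.requires_grad_(False)
num_cleared = clear_frozen_norm_grads(model)
```

With the grad set to None rather than left as a tensor, PyTorch optimizers skip those parameters in step(), so a frozen norm layer is not changed by weight decay or momentum.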

Sometimes the DataParallel module in PyTorch behaves unpredictably and can be very slow, so we recommend using distributed training instead.
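As a rough illustration of the switch, the sketch below wraps a model in torch.nn.parallel.DistributedDataParallel instead of nn.DataParallel. It is a single-process CPU demo using the gloo backend and a file-based rendezvous so that it runs anywhere; the helper name wrap_with_ddp and the toy nn.Linear model are hypothetical, and the actual launch setup of the project discussed here may differ.

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def wrap_with_ddp(model: nn.Module, rank: int, world_size: int, init_method: str) -> DDP:
    """Join the process group (if needed) and wrap the model in DDP."""
    if not dist.is_initialized():
        dist.init_process_group(
            backend="gloo",  # CPU-friendly backend, chosen for this demo
            init_method=init_method,
            rank=rank,
            world_size=world_size,
        )
    return DDP(model)


# Single-process demo: world_size=1, so the all-reduce is trivial.
with tempfile.TemporaryDirectory() as tmp:
    init_method = "file://" + os.path.join(tmp, "ddp_init")
    ddp_model = wrap_with_ddp(nn.Linear(4, 2), rank=0, world_size=1, init_method=init_method)
    loss = ddp_model(torch.randn(8, 4)).sum()
    loss.backward()  # gradients are synchronized across processes here
    grad_shape = tuple(ddp_model.module.weight.grad.shape)
    dist.destroy_process_group()
```

In a real multi-GPU run, one process per GPU is typically started with torchrun, the backend is usually nccl, and the model is moved to its local device and wrapped with device_ids=[local_rank].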

Currently I don't have the computational resources to reproduce the result. But in my previous experiment, the final result (training the searched architecture using the official TensorFlow implementation) was on...