tendar
In other words, the second place is that CUDA_CALL(cudaStreamSynchronize(0)) should be moved to after src_op->SetDataSource(tl_gpu_, cudaStream_t(0)), not placed before it.
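
To illustrate why the order matters, here is a toy C++ model. It is not the CUDA or DALI API. FakeStream, set_data_source and the two helper functions are hypothetical names made up for this sketch. It assumes that SetDataSource queues an asynchronous copy on the stream, which the comment above implies but does not state.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy stand-in for a CUDA stream (hypothetical, not the real API):
// work is only queued by enqueue() and runs when synchronize() is called.
struct FakeStream {
  std::queue<std::function<void()>> pending;

  void enqueue(std::function<void()> work) { pending.push(std::move(work)); }

  // Runs every queued task in order, like waiting for the stream to drain.
  void synchronize() {
    while (!pending.empty()) {
      pending.front()();
      pending.pop();
    }
  }
};

// Stand-in for src_op->SetDataSource(...): assumed to queue an async copy.
void set_data_source(FakeStream &stream, const std::vector<int> &src,
                     std::vector<int> &dst) {
  stream.enqueue([&src, &dst] { dst = src; });
}

// Synchronize BEFORE queuing the copy: the copy is still pending afterwards.
bool copy_done_when_sync_before() {
  FakeStream stream;
  std::vector<int> src{1, 2, 3}, dst;
  stream.synchronize();               // nothing queued yet, so this is a no-op
  set_data_source(stream, src, dst);  // copy queued but never waited on
  return dst == src;
}

// Synchronize AFTER queuing the copy: the copy has finished on return.
bool copy_done_when_sync_after() {
  FakeStream stream;
  std::vector<int> src{1, 2, 3}, dst;
  set_data_source(stream, src, dst);  // copy queued
  stream.synchronize();               // waits for the queued copy
  return dst == src;
}
```

In this model, copy_done_when_sync_before() reports that the data has not arrived, while copy_done_when_sync_after() reports that it has. This mirrors the suggested placement of cudaStreamSynchronize(0) after SetDataSource.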