FasterTransformer
Does FasterTransformer support multi-stream pipeline parallelism?
Hello guys:
Since the computation and the communication do not depend on each other, I think multi-stream pipeline parallelism could hide the communication time and improve performance.
I couldn't find how to enable or configure a multi-stream feature in the code. Can anyone help? Thank you so much~
ftNcclRecv(sequence_lengths_ + id_offset, local_batch_size * beam_width, pipeline_para_.world_size_ - 1, pipeline_para_, stream_);
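To make the idea concrete, here is a rough timing model of what I mean. It is only a sketch, not FasterTransformer code: the function names `single_stream_time` and `two_stream_time` are made up for illustration, and it assumes every micro-batch has the same compute time and the same communication time, with communication of one micro-batch allowed to overlap the compute of the next.

```cpp
#include <algorithm>

// Hypothetical helper (not part of FasterTransformer).
// Compute and communication share one stream, so they run back to back.
double single_stream_time(int micro_batches, double compute_ms, double comm_ms)
{
    return micro_batches * (compute_ms + comm_ms);
}

// Hypothetical helper (not part of FasterTransformer).
// Communication runs on its own stream: the send/recv of micro-batch i
// starts once its compute is done and the previous transfer has finished,
// while the compute stream already works on micro-batch i + 1.
double two_stream_time(int micro_batches, double compute_ms, double comm_ms)
{
    double compute_done = 0.0;  // time at which the compute stream finishes the current micro-batch
    double comm_done = 0.0;     // time at which the communication stream finishes the current transfer
    for (int i = 0; i < micro_batches; ++i) {
        compute_done += compute_ms;
        comm_done = std::max(compute_done, comm_done) + comm_ms;
    }
    return comm_done;
}
```

With 4 micro-batches, 10 ms of compute and 3 ms of communication each (numbers invented for the example), this model gives 52 ms for a single stream and 43 ms with a separate communication stream, so only the last transfer is left exposed.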