0xd8b

13 comments by 0xd8b

After converting the T5 model, with "remove input padding" set to false and a maximum batch size of 8, when model inference is set to...

> @thanhlt998 fixed. It was caused by missing CUDA stream synchronization between the encoder stream and the decoder stream. The fix will be released in next week's weekly main branch update. For...

@QiJune I encountered the same issue with the T5 model (float16): under extensive sample testing, the inference results vary slightly with different batch sizes. Is this expected behavior? I...
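One plausible cause of such small batch-size-dependent differences under float16 (an assumption, not confirmed in the thread) is that different batch sizes can select different kernels and reduction orders, and float16 addition is not associative. A minimal NumPy sketch of the effect:

```python
import numpy as np

# Same float16 values summed in two different orders, as two
# differently-tiled kernels might do. float16 addition is not
# associative, so the results can disagree slightly even though
# both are valid half-precision sums.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024).astype(np.float16)

# Order 1: sequential left-to-right accumulation in float16.
seq = np.float16(0.0)
for v in x:
    seq = np.float16(seq + v)

# Order 2: blocked (pairwise-style) accumulation in float16.
blocked = x.reshape(32, 32).sum(axis=1, dtype=np.float16).sum(dtype=np.float16)

# float64 reference: both float16 sums approximate this value.
ref = float(x.astype(np.float64).sum())

print(float(seq), float(blocked), ref)
```

Both half-precision results land near the float64 reference; the small gap between them is the same kind of batch-size-dependent discrepancy described above, and it is generally considered normal for half-precision inference rather than a correctness bug.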