Saish Reddy Komalla

1 comment by Saish Reddy Komalla

This may be outside the scope of the original issue, but is there a concrete reason for suggesting that dim_feedforward be 4 * hidden_dim? Can they be the same? As in the paper, using...