QI JUN

Results: 75 comments by QI JUN

Yes, this is expected. Prompt tuning does not currently work with block_reuse.

@kaiyux Could you please have a look? Thanks

cc @syuoni for visibility. Let's consider supporting the GPQA task in the ongoing accuracy suite.

Hi @ttim, if my understanding is correct, `gelu_pytorch_tanh` should be equivalent to the `gelu` activation function; they are just different implementations. Could you please share the error log when building...
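For reference, here is a minimal sketch comparing the two formulations these names usually refer to: the erf-based GELU and its tanh approximation. The function names `gelu_erf`, `gelu_tanh`, and `max_abs_diff` are illustrative and not taken from the codebase under discussion, and mapping `gelu_pytorch_tanh` to the tanh formula is an assumption based on the name. Under that assumption, the two are numerically close but not bit-identical.

```python
import math


def gelu_erf(x: float) -> float:
    """GELU written with the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (assumed to be what `gelu_pytorch_tanh` denotes)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def max_abs_diff(lo: float = -6.0, hi: float = 6.0, steps: int = 1201) -> float:
    """Largest absolute gap between the two variants on an evenly spaced grid."""
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        worst = max(worst, abs(gelu_erf(x) - gelu_tanh(x)))
    return worst


if __name__ == "__main__":
    print(f"max |gelu_erf - gelu_tanh| on [-6, 6]: {max_abs_diff():.2e}")
```

If a build or conversion step rejects one of the two names, treating them as interchangeable is reasonable only when this small numerical gap is acceptable for the model in question.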

@ttim, yes, I think so. Could you please submit an MR to fix it? Or would you prefer to wait for us to fix it?

@Funatiq Could you please have a look? Thanks

/bot skip --comment "CI passed"