QI JUN
Yes, this is expected: prompt tuning does not work with `block_reuse` yet.
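As a workaround, disabling KV-cache block reuse should avoid the conflict. A minimal sketch, assuming the LLM API's `KvCacheConfig` with an `enable_block_reuse` flag (the flag name and the model path here are assumptions, please check against your installed version):

```python
from tensorrt_llm import LLM
from tensorrt_llm.llmapi import KvCacheConfig

# Turn off KV-cache block reuse so it does not conflict with prompt tuning.
llm = LLM(
    model="/path/to/model",  # hypothetical path
    kv_cache_config=KvCacheConfig(enable_block_reuse=False),
)
```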
@kaiyux Could you please have a look? Thanks
cc @syuoni for visibility; let's consider supporting the GPQA task in the ongoing accuracy suite.
Hi @ttim, if my understanding is correct, `gelu_pytorch_tanh` should be equivalent to the `gelu` activation function; they are just different implementations. Could you please share the error log when building...
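For reference, a small PyTorch sketch of what I mean (assuming `gelu_pytorch_tanh` corresponds to torch's tanh-approximated GELU): the two variants produce nearly identical outputs.

```python
import torch
import torch.nn.functional as F

x = torch.randn(8)

# Exact (erf-based) GELU vs. the tanh approximation used by `gelu_pytorch_tanh`.
exact = F.gelu(x)                          # approximate="none" (default)
tanh_approx = F.gelu(x, approximate="tanh")

# The two agree to within a small numerical tolerance.
print(torch.max((exact - tanh_approx).abs()))
```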
@ttim Yes, I think so. Could you please submit an MR to fix it? Or would you prefer to wait for us to fix it?
@Funatiq Could you please have a look? Thanks
@Tracin Could you please have a look? Thanks
/bot run --add-multi-gpu-test
/bot run
/bot skip --comment "CI passed"