Jiarui Fang(方佳瑞)
I guess the error comes from setting the batch size to 1. If I set per_device_train_batch_size to 2, it works.
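A minimal sketch of the workaround, assuming the error surfaces in a standard Hugging Face Trainer run; output_dir is a placeholder:

```python
# Hypothetical Trainer configuration illustrating the workaround:
# per_device_train_batch_size=1 triggered the assertion, 2 does not.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",           # placeholder path
    per_device_train_batch_size=2,    # works; 1 hits the assertion bug
)
```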
> Thanks, Jiarui. It seems like an assertion bug for batch size 1; we'll fix it. BTW, is Turbo working on training support? I'm looking forward to it. Haha, thanks for your...
Training and inference should not be affected. The new Hugging Face 4.x.x releases changed the interface relative to the old 3.x.x ones, so I upgraded; as long as everything is on 4.x.x it should not matter.
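For reference, a minimal sketch of the best-known 3.x → 4.x breaking change, assuming the interface difference meant here is the model output format (4.x returns ModelOutput objects by default instead of plain tuples):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("hello world", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)               # 4.x: a ModelOutput object
    hidden = outputs.last_hidden_state      # attribute access in 4.x
    # 3.x returned a plain tuple; to keep that behavior under 4.x:
    legacy = model(**inputs, return_dict=False)  # (last_hidden_state, pooler_output)
```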
Because bert-base-uncased comes with a pooler layer. If your model doesn't have one, you can remove it.
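A minimal sketch, assuming the transformers 4.x BertModel API, where add_pooling_layer=False drops the pooler:

```python
from transformers import BertModel

with_pooler = BertModel.from_pretrained("bert-base-uncased")
print(with_pooler.pooler)     # Linear(768, 768) followed by Tanh

no_pooler = BertModel.from_pretrained("bert-base-uncased", add_pooling_layer=False)
print(no_pooler.pooler)       # None: the pooler is not constructed or loaded
```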
For large-model inference, take a look at NVIDIA's FasterTransformer. Supporting this in Turbo would not be hard if you are willing to do some testing, but I have no plans to work on it myself in the next six months.
Not supported yet; it's on the TODO list.
Thanks for your interest in TurboTransformers. It is an open-source project licensed under the BSD 3-Clause License. You can modify the code and use it in your own commercial software...
Take a look at this issue: https://github.com/Tencent/TurboTransformers/issues/70
Possibly because your PyTorch tensors are still on device_id=0.
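A minimal sketch of the fix, assuming the intent is to run on a GPU other than cuda:0; the tensor is a placeholder:

```python
import torch

device = torch.device("cuda:1")   # the GPU you actually want to use
x = torch.randn(1, 128)           # created on CPU by default
x = x.to(device)                  # move it explicitly instead of relying on defaults
print(x.device)                   # cuda:1 rather than the default cuda:0
```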
Maybe the input is too short? Do you see a speedup with ONNX Runtime (onnxrt)?
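A minimal sketch of that sanity check, assuming a BERT-like model already exported to ONNX; the file path and input names are placeholders:

```python
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")      # placeholder path
seq_len = 128                                  # short inputs may show little speedup
feeds = {
    "input_ids": np.ones((1, seq_len), dtype=np.int64),
    "attention_mask": np.ones((1, seq_len), dtype=np.int64),
}

start = time.time()
for _ in range(100):
    sess.run(None, feeds)                      # None requests all model outputs
print(f"avg latency: {(time.time() - start) / 100 * 1000:.2f} ms")
```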