Jiarui Fang(方佳瑞)

Results: 220 comments by Jiarui Fang(方佳瑞)

The examples should already cover this. The encoder is similar to BERT, so the change is straightforward.

In your command line, set `export OMP_NUM_THREADS=?`.

If you are using onnxrt as the backend, the multi-threading is not managed by the OMP env variable. Set `export MKL_VERBOSE=1` and check the number of threads used for GEMM.
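The two replies above can be combined into a quick checklist. A minimal sketch, assuming a CPU build with an OpenMP/MKL backend; the thread count `4` is an arbitrary example value:

```shell
#!/bin/sh
# Control intra-op thread count for OpenMP-backed backends.
export OMP_NUM_THREADS=4
# Make MKL print each GEMM call (including the thread count it used),
# which is useful when the backend (e.g. onnxrt) ignores OMP settings.
export MKL_VERBOSE=1
# Confirm the variables are visible to child processes.
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS MKL_VERBOSE=$MKL_VERBOSE"
```

Run your benchmark script in the same shell afterwards so it inherits both variables.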

No. Because their embedding differs from BERT's, their embedding is not implemented in C++; it uses PyTorch layers instead. If you implement their embedding, the whole inference can run in C++.

Sorry, I am not familiar with the ViT arch. It has been used in both Encoder and Decoder Transformer architectures, which mainly focus on NLP tasks. If you develop your...

You can manually install the requirements from the dockerfile on your server.

Hi, could you update to the latest version, v0.4.2? If you still need v0.3.0 support, can you give a commit hash ID? I am willing to help you...

Hi, I got it. Can you paste your build commands as well as the run scripts, which may help me reproduce the error? I guess you are working on...

Thanks for your report. Compared with my previous results, the QPS of both torch and turbo in your screenshot are not correct. According to a 0418 version, when seq_len is 100,...

I have no V100 on hand. Could you please try our previous commit and check the benchmark results: `git reset --hard 64dd569da9ce8bf1f78fcd108356607371b742ed`