Xiaohui Wang
Currently not supported. The search module and the decoder are coupled together right now, and we are developing a separate search module. This is expected to take about one month.
You can run native_fairseq_wmt14en2de.sh and ls_fairseq_wmt14en2de.sh under https://github.com/bytedance/lightseq/tree/master/examples/training/fairseq to test whether there are any problems.
LightSeq makes no changes to NCCL communication. I guess it may be a conflict between your fairseq and NCCL versions.
Post-layer-normalization is supported by setting post_ln=1. The comment "though post_ln=1 ..." is out of date.
Not yet; we will test it later. Thanks!
We will support it within a month.
For language model scoring (e.g., perplexity), it will be fine. For generation, there may be problems caused by running out of GPU memory (OOM). You can fill in zeros for the position embedding.
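The zero-filling idea above can be sketched as follows. This is a minimal, framework-free illustration (the names `pos_embedding` and `embed` are hypothetical, not LightSeq or fairseq APIs): with an all-zero position table, positions contribute nothing to the token representation, so sequence length no longer depends on a learned positional range.

```python
# Hypothetical sketch: replace the learned position-embedding table with zeros
# so every position adds nothing to the token embedding.
num_positions, embed_dim = 512, 8

# All-zero position embedding table (num_positions x embed_dim).
pos_embedding = [[0.0] * embed_dim for _ in range(num_positions)]

def embed(token_vec, position):
    """Add the (zeroed) position embedding to a token embedding vector."""
    return [t + p for t, p in zip(token_vec, pos_embedding[position])]
```

In a real fairseq model you would apply the same idea by zeroing the weights of the positional-embedding module before running inference.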
You can build the dynamic link library from source. A tutorial is available here: https://github.com/bytedance/lightseq/blob/master/docs/inference/build.md
Can you reproduce our result on the WMT14 En-De dataset on your hardware and environment? https://github.com/bytedance/lightseq/blob/master/examples/training/fairseq/ls_fairseq_wmt14en2de.sh
Maybe your GPU doesn't support Tensor Cores for FP16. You can try to build LightSeq in FP32 mode: ENABLE_FP32=1 pip3 install -e $PROJECT_DIR
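As background for the suggestion above: FP16 Tensor Cores first appeared on Volta GPUs (CUDA compute capability 7.0), so older cards need the FP32 build. A small sketch of that check (the helper name is hypothetical; with PyTorch installed you could feed it `torch.cuda.get_device_capability()`):

```python
def supports_fp16_tensor_cores(major, minor):
    """Return True if a GPU with this CUDA compute capability has
    FP16 Tensor Cores (Volta, compute capability 7.0, and newer)."""
    return (major, minor) >= (7, 0)

# Examples: Pascal GTX 1080 Ti is (6, 1); Turing T4 is (7, 5); Ampere A100 is (8, 0).
```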