Xiaohui Wang
Currently not supported. The search module and the decoder are coupled together right now, and we are developing a separate search module. This is expected to take about one month.
You can run native_fairseq_wmt14en2de.sh and ls_fairseq_wmt14en2de.sh under https://github.com/bytedance/lightseq/tree/master/examples/training/fairseq to test whether there are any problems.
LightSeq makes no changes to NCCL communication. I guess it may be a conflict between your fairseq and NCCL versions.
Post-layer-normalization is supported by setting post_ln=1. The comment "though post_ln=1 ..." is out of date.
Not yet; we will test it later. Thanks!
We will support it within a month.
For language model scoring (e.g., perplexity), it will be fine. For generation, there may be problems caused by running out of GPU memory (OOM). You can fill in zeros for the position embedding.
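The zero-filling idea above can be sketched as follows. This is a minimal, framework-free illustration (the names `pos_embedding` and `embed` are hypothetical, not LightSeq or fairseq APIs): with an all-zero position table, positions contribute nothing to the token representation, so sequence length no longer depends on a learned positional range.

```python
# Hypothetical sketch: replace the learned position-embedding table with zeros
# so every position adds nothing to the token embedding.
num_positions, embed_dim = 512, 8

# All-zero position embedding table (num_positions x embed_dim).
pos_embedding = [[0.0] * embed_dim for _ in range(num_positions)]

def embed(token_vec, position):
    """Add the (zeroed) position embedding to a token embedding vector."""
    return [t + p for t, p in zip(token_vec, pos_embedding[position])]
```

In a real fairseq model you would apply the same idea by zeroing the weights of the positional-embedding module before running inference.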
You can build the dynamic link library from source. A tutorial is available here: https://github.com/bytedance/lightseq/blob/master/docs/inference/build.md
Can you reproduce our result on the WMT14 En-De dataset on your hardware and environment? https://github.com/bytedance/lightseq/blob/master/examples/training/fairseq/ls_fairseq_wmt14en2de.sh
Maybe your GPU doesn't support Tensor Cores for FP16. You can try to build LightSeq in FP32 mode: ENABLE_FP32=1 pip3 install -e $PROJECT_DIR
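As background for the suggestion above: FP16 Tensor Cores first appeared on Volta GPUs (CUDA compute capability 7.0), so older cards need the FP32 build. A small sketch of that check (the helper name is hypothetical; with PyTorch installed you could feed it `torch.cuda.get_device_capability()`):

```python
def supports_fp16_tensor_cores(major, minor):
    """Return True if a GPU with this CUDA compute capability has
    FP16 Tensor Cores (Volta, compute capability 7.0, and newer)."""
    return (major, minor) >= (7, 0)

# Examples: Pascal GTX 1080 Ti is (6, 1); Turing T4 is (7, 5); Ampere A100 is (8, 0).
```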