Xiaohui Wang

28 comments by Xiaohui Wang

Thanks, we will provide a distributed version of our training example. Currently, we have not considered multi-node training.

Thank you for the information; maybe in 2-4 months.

You can try to reinstall protobuf and make sure to set CXXFLAGS:

curl -O -L -C - https://github.com/protocolbuffers/protobuf/releases/download/v3.13.0/protobuf-cpp-3.13.0.tar.gz
tar xf protobuf-cpp-3.13.0.tar.gz
cd protobuf-3.13.0 && ./autogen.sh
./configure "CFLAGS=-fPIC" "CXXFLAGS=-fPIC"
make -j...

In fact, the calculation process in LightSeq is the method (1) you describe. Specifically, qkv_w in LightSeq is a 1-D array in the format [q_1, k_1, v_1, ..., q_i, k_i, v_i, ...
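A minimal sketch of one possible interleaved layout, assuming the index i runs over attention heads and q_i, k_i, v_i are the per-head projection weights; the exact grouping in LightSeq may differ, so treat this only as an illustration of packing Q/K/V into a single 1-D array:

```python
import numpy as np

# Hypothetical shapes: hidden size H, num_heads heads, per-head dim d_head.
H, num_heads, d_head = 8, 2, 4

q = np.random.randn(num_heads, H, d_head).astype(np.float32)
k = np.random.randn(num_heads, H, d_head).astype(np.float32)
v = np.random.randn(num_heads, H, d_head).astype(np.float32)

# Interleave per group: [q_1, k_1, v_1, ..., q_i, k_i, v_i, ...] flattened to 1-D.
qkv_w = np.concatenate(
    [np.stack([q[i], k[i], v[i]]).reshape(-1) for i in range(num_heads)]
)
print(qkv_w.shape)  # (num_heads * 3 * H * d_head,) = (192,)
```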

Not yet; we will evaluate the priority of T5/mT5. Could you tell us about your use case, e.g. pretraining/finetuning/inference?

This may not be a loss of accuracy. You may need to check whether your model structure is the same as BERT's. Also, check options like pre-LN or post-LN...
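For reference, a minimal sketch of the pre-LN vs. post-LN residual ordering (the Block class and its placeholder sublayer are illustrative only, not LightSeq code):

```python
import torch.nn as nn

class Block(nn.Module):
    """Contrast pre-LN and post-LN residual ordering in a transformer block."""
    def __init__(self, d_model, pre_ln=True):
        super().__init__()
        self.pre_ln = pre_ln
        self.norm = nn.LayerNorm(d_model)
        self.sublayer = nn.Linear(d_model, d_model)  # stands in for attention/FFN

    def forward(self, x):
        if self.pre_ln:
            # pre-LN: normalize before the sublayer, then add the residual
            return x + self.sublayer(self.norm(x))
        # post-LN: add the residual first, then normalize
        return self.norm(x + self.sublayer(x))
```

If the exported checkpoint was trained with one ordering and the inference config assumes the other, outputs will diverge even though the weights are identical.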

You can try to solve it by installing LightSeq with:

git clone https://github.com/bytedance/lightseq.git
cd lightseq
pip install -e .

Not supported yet.

There is no update yet comparing against the latest TensorRT and FasterTransformer; we will update it in two months.

LightSeq does not currently support large models.