LiPengtao comments

Results: 4 comments by LiPengtao

How could we accelerate NLLB? Can we use TensorRT or vLLM?

How to accelerate NLLB

NLLB is unable to translate into a complete long sentence in Chinese.

I also encountered this problem.
Src: "We now have 4-month-old mice that are non-diabetic that used to be diabetic," he added.
Tgt: 他补充道：“我们现在有4个月大没有糖尿病的老鼠，但它们曾经得过该病。”
Predict: 他补充说:"我们现在有4个月的小鼠,

How to convert NLLB models to ONNX format and quantize them to INT4

Yes, I am very interested in how to reduce RAM consumption, use the KV cache, and speed up inference. Also, what...

How to convert NLLB models to ONNX format and quantize them to INT4

I exported the NLLB model to ONNX as three files: decoder_with_past_model_quantized.onnx, decoder_model_quantized.onnx, and encoder_model_quantized.onnx. I'm not sure how to use ONNX Runtime for inference, and which variables make...
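For the ONNX Runtime question above, here is a minimal sketch of one possible route. It assumes the three files were produced by Hugging Face Optimum and sit in a local directory next to the model's config.json and tokenizer files. The name decoder_with_past_model_quantized.onnx is an assumption, because that file name is garbled in the comment. The translate helper, its parameters, and the default language codes are hypothetical names for illustration, and the file-name keyword arguments may differ between Optimum versions.

```python
# File names as listed in the comment above. The decoder-with-past name is an
# assumption, since it appears garbled in the original text.
ONNX_FILES = {
    "encoder_file_name": "encoder_model_quantized.onnx",
    "decoder_file_name": "decoder_model_quantized.onnx",
    "decoder_with_past_file_name": "decoder_with_past_model_quantized.onnx",
}


def translate(text, model_dir, src_lang="eng_Latn", tgt_lang="zho_Hans",
              max_new_tokens=256):
    """Translate `text` with an exported NLLB model through ONNX Runtime.

    `model_dir` is a hypothetical local directory that holds the three ONNX
    files plus config.json and the tokenizer files.
    """
    # Imported lazily so this module loads without the heavy dependencies.
    from optimum.onnxruntime import ORTModelForSeq2SeqLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir, src_lang=src_lang)
    # use_cache=True asks Optimum to run later decoding steps through the
    # decoder-with-past graph, so cached keys and values are reused.
    model = ORTModelForSeq2SeqLM.from_pretrained(
        model_dir, use_cache=True, **ONNX_FILES
    )

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # NLLB expects the target language code as the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

A call would look like translate("Hello, world.", "path/to/nllb-onnx"), where the path is a placeholder. With this wrapper the encoder and decoder inputs and the cached key/value tensors are wired up by the library, so they do not have to be passed to ONNX Runtime sessions by hand.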