taoli
> Any tests to confirm the change?

@ChaiBapchya Tested with the following edited config.mk. The sample only compiled when `USE_CUDA=1`. Since not all users use CUDA, I didn't make it...
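For reference, a minimal sketch of the relevant config.mk settings (the CUDA path is an assumption; point `USE_CUDA_PATH` at your local install):

```
# Enable CUDA support in MXNet's make/config.mk
USE_CUDA = 1
# Assumed local CUDA install path; adjust as needed
USE_CUDA_PATH = /usr/local/cuda
# Optional: cuDNN acceleration, if installed
USE_CUDNN = 1
```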
Please try setting `use_custom_all_reduce` to false.
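If you are building with `trtllm-build`, a sketch of how this can look; note the exact flag name and accepted values vary across TRT-LLM releases, so treat the `--use_custom_all_reduce disable` form below as an assumption to be checked against your version's `trtllm-build --help`:

```
# Assumed trtllm-build invocation; flag availability varies by TRT-LLM version
trtllm-build --checkpoint_dir ./ckpt \
             --output_dir ./engine \
             --use_custom_all_reduce disable
```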
> > I got the same error without container
>
> I solved this problem by using CUDA 12.2 and OpenMPI.

Glad your issue is solved. Please try using the container. Since...
It's a known issue with the xformers library on 30xx-series GPUs; try installing the dev version of xformers. This works on my 3060.

```
pip install xformers==0.0.17.dev466
```

See https://github.com/facebookresearch/xformers/issues/628
Hi @Lokiiiiii, thanks for the comments and the proposal. TRT-LLM evolves quickly, adding features and optimization techniques in almost every release. The current focus is on trying to provide the best...
@ganliqiang could you use the Hugging Face checkpoint for this model? The Hugging Face model is supported and tested. TRT-LLM needs to read `hf_config.architectures` to make sure the TRT-LLM...
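As a quick check, you can inspect the `architectures` field of the checkpoint's config with `transformers` (the checkpoint path below is a placeholder):

```
# Assumed checkpoint path; prints e.g. ["LlamaForCausalLM"]
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("path/to/hf_checkpoint")
print(cfg.architectures)
```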