Maximiliano Levi
I can't reproduce it. Does it always happen?
I am not an expert on Linux, but Mono is already integrated into the build, so you just need to execute the .x86. Also, some Linux users say they need...
@hishamhm Oh, I see. I'll build an x64 version then. Thanks for the info!
Thank you for the fast response. Do you know if it's a bug or an inherent limitation of the current implementation?
@lfr-0531 Thank you for the reply. Currently I am testing llama 3.2 1B with the following command ``` trtllm-build --max_batch_size 8 --max_seq_len 1024 --max_multimodal_len 131072 --gpt_attention_plugin auto --gemm_plugin auto --model_cls_name...
So we just need to broadcast the second dimension when batch_size > 1? I can open a PR.