Dev Goel
`nvcc -V` and `python -c 'import torch; print(torch.version.cuda);'` both return the same CUDA version, 11.8:
```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda...
```
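If it helps to script that check, here is a minimal sketch (assuming `nvcc` is on PATH) that compares the PyTorch build's CUDA version against what nvcc reports:

```python
# Minimal sketch: confirm the PyTorch wheel's CUDA version matches the
# toolkit's nvcc. Assumes nvcc is on PATH.
import subprocess

import torch

nvcc_out = subprocess.run(
    ["nvcc", "--version"], capture_output=True, text=True
).stdout
print("torch.version.cuda:", torch.version.cuda)  # e.g. 11.8
print("nvcc matches:", f"release {torch.version.cuda}" in nvcc_out)
```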
Thanks, it solved the problem.
Hey @nornak, I am using UE5.2 and am also trying to import VRM4U on my M1 Mac mini. I am able to build libassimp.a using CMake from the terminal, but my...
@Gemini321 #12 I just went through this issue and solved the inputRecorder issue, but I am stuck on another problem after that.
Has anyone solved all of the problems? I am getting every problem discussed in this thread.
@jamestwhedbee @lopuhin I am stuck on this:
```
Traceback (most recent call last):
  File "quantize.py", line 614, in <module>
    quantize(args.checkpoint_path, args.model_name, args.mode, args.groupsize, args.calibration_tasks, args.calibration_limit, args.calibration_seq_length, args.pad_calibration_inputs, args.percdamp, args.blocksize, args.label)
  File "quantize.py", line...
```
@lopuhin I am running it on an A100, Python 3.8, with the CUDA 11.8 nightly, so I don't think it is about lower compute capability.
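As a sanity check, the compute capability can be read straight from PyTorch; a minimal sketch (an A100 should report 8.0):

```python
# Print the compute capability of the first visible CUDA device;
# an A100 reports (8, 0), i.e. sm_80. Assumes a CUDA device is present.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"compute capability: sm_{major}{minor}")
```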
I am doing the same FP8 quantization, but on a Llama-2 34B model using 4x H100, and I am facing the same issue as well.
@RonanKMcGovern On your first point: TensorRT-LLM supports different LLM architectures, and you can use [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) and its example code to convert a safetensors model to a TensorRT engine file, which can...
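For reference, the usual flow is two steps: convert the Hugging Face checkpoint with the example script for your model family, then build the engine with `trtllm-build`. A sketch driving both from Python; the script path, checkpoint directories, and flags here are assumptions and vary by model family and TensorRT-LLM version, so check the example README:

```python
# Hedged sketch of the usual two-step TensorRT-LLM flow: convert the
# Hugging Face checkpoint with the family's example script, then build
# the engine. Paths, script location, and flags are assumptions.
import subprocess

subprocess.run(
    ["python", "examples/llama/convert_checkpoint.py",
     "--model_dir", "./Llama-2-7b-hf",      # assumed local safetensors checkpoint
     "--output_dir", "./ckpt_trtllm",
     "--dtype", "float16"],
    check=True,
)
subprocess.run(
    ["trtllm-build",
     "--checkpoint_dir", "./ckpt_trtllm",
     "--output_dir", "./engine_dir"],
    check=True,
)
```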
@RonanKMcGovern You can use https://github.com/npuichigo/openai_trtllm; it is a wrapper that creates an OpenAI-compatible API for TensorRT-LLM.
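Once the server is up, any OpenAI-style client can talk to it. A minimal sketch with `requests`; the host, port, and model name here are assumed values, so match them to your openai_trtllm configuration:

```python
# Hedged sketch: hit the server's OpenAI-compatible chat-completions
# endpoint. The URL and model name are assumptions.
import requests

resp = requests.post(
    "http://localhost:3000/v1/chat/completions",  # assumed host/port
    json={
        "model": "ensemble",  # assumed model name on the Triton backend
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```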