Overman

4 comments by Overman

On my laptop with 16 GB RAM and 16 GB VRAM, @hamidahmadian's solution lets me load the models, but I still get an OOM error when calling `merge_and_unload()`. However, the following...

> You need to provide `CUDA_DOCKER_ARCH` to make it work. E.g., `make LLAMA_CUBLAS=1 CUDA_DOCKER_ARCH=sm_87 LLAMA_CUDA_F16=1 -j 10` for Jetson Orin or `make LLAMA_CUBLAS=1 CUDA_DOCKER_ARCH=sm_72 LLAMA_CUDA_F16=1 -j 10` for Jetson Xavier....
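
For anyone scripting this build, here is a minimal sketch of how the two commands quoted above could be generated from the board name. The `JETSON_ARCH` table and the `make_command` helper are hypothetical names made up for illustration; only the two architectures mentioned in the quote (`sm_87` for Orin, `sm_72` for Xavier) are included, and other boards would need their own values.

```python
import shlex

# Hypothetical lookup table: only the two values quoted in the comment above.
JETSON_ARCH = {"orin": "sm_87", "xavier": "sm_72"}


def make_command(board, jobs=10):
    """Return the make invocation for a Jetson board as an argument list.

    Raises KeyError for boards that are not in JETSON_ARCH.
    """
    arch = JETSON_ARCH[board.lower()]
    return [
        "make",
        "LLAMA_CUBLAS=1",
        f"CUDA_DOCKER_ARCH={arch}",
        "LLAMA_CUDA_F16=1",
        "-j",
        str(jobs),
    ]


if __name__ == "__main__":
    # Print the command as a single shell-ready string for the Xavier.
    print(shlex.join(make_command("xavier")))
```

The list form can be passed directly to `subprocess.run(..., check=True)` from the source directory, which avoids shell quoting issues.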

@vvsotnikov I am indeed still on CUDA 10. It's great to hear that upgrading to CUDA 11 is possible on the Xavier NX from someone who has actually done it!...

I can confirm that after updating to JetPack 5.1.2 and using the command `make LLAMA_CUBLAS=1 CUDA_DOCKER_ARCH=sm_72 LLAMA_CUDA_F16=1 -j 10` provided by @vvsotnikov, I can successfully compile and run the newest...