iamsach

Results 2 comments of iamsach

The accelerate fix worked for me. The models are now loaded onto the GPU, after I passed the LLM I had instantiated with CTransformers to the prepare...

I used this, @Jeevi10:

```python
from accelerate import Accelerator
from langchain.llms import CTransformers

accelerator = Accelerator()
config = {
    'max_new_tokens': 512,
    'repetition_penalty': 1.1,
    'context_length': 8000,
    'temperature': 0,
    'gpu_layers': 50,
}
llm = CTransformers(model = "./codellama-7b.Q4_0.gguf", ...
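The snippet above is cut off, so here is a minimal sketch of the full pattern it describes: build a CTransformers LLM with `gpu_layers` set in the config, then pass it through `Accelerator.prepare()`. The helper name `build_llm`, the `model_type="llama"` argument, and the model path are illustrative assumptions, not from the original comment; it assumes `accelerate` and `langchain` (with `ctransformers`) are installed.

```python
# Generation/offload settings from the comment above; gpu_layers controls
# how many transformer layers are offloaded to the GPU.
config = {
    'max_new_tokens': 512,
    'repetition_penalty': 1.1,
    'context_length': 8000,
    'temperature': 0,
    'gpu_layers': 50,
}

def build_llm(model_path="./codellama-7b.Q4_0.gguf"):
    # Imports are local so the sketch can be read without the libraries
    # installed; in a real script they would sit at module top level.
    from accelerate import Accelerator
    from langchain.llms import CTransformers

    accelerator = Accelerator()
    llm = CTransformers(
        model=model_path,
        model_type="llama",  # assumed model type for a CodeLlama GGUF file
        config=config,
    )
    # prepare() places the wrapped model on the available device
    # (the GPU, when one is present), which is the fix described above.
    return accelerator.prepare(llm)
```

Calling `build_llm()` then returns an LLM ready for use in a LangChain chain, with the configured layers offloaded to the GPU.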