CHesketh76

Results: 18 comments of CHesketh76

On Linux it happens 100% of the time.

Add-on: This is not just llama.cpp and GGUF files, but also GPTQ files with Transformers. Most of the VRAM can be freed up when unloading using transformers, but...

I could not find a solution to this issue on Ubuntu, but after running this on Fedora 39 the model weights do unload from RAM and VRAM as...
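For reference, a minimal sketch of the unload pattern being discussed, assuming a plain transformers + CUDA setup (the model name is just a placeholder):

```python
import gc
import torch
from transformers import AutoModelForCausalLM

# Load a model onto the GPU (placeholder model name).
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1").to("cuda")

# Unload: drop every reference, run garbage collection, then release cached CUDA blocks.
del model
gc.collect()
torch.cuda.empty_cache()

# Memory still held by this process; ideally this drops close to zero after unloading.
print(torch.cuda.memory_allocated() / 1e9, "GB still allocated")
```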

Oh, so I can just remove that if I am not using a multimodal model? Also, could I combine LLaVA with my Mistral model to create a multimodal model?

Was this resolved? I am also getting the same error when running this:

```
python -m vllm.entrypoints.openai.api_server --model TheBloke/phi-2-GPTQ --quantization gptq --trust-remote-code
```
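For anyone hitting the same thing: once that server does start, it exposes an OpenAI-compatible endpoint. A minimal client sketch, assuming the default host/port and the `openai` Python package:

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server listens on localhost:8000 by default;
# the API key is not checked unless one was set when the server was launched.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="TheBloke/phi-2-GPTQ",
    prompt="AI is going to",
    max_tokens=64,
)
print(completion.choices[0].text)
```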

Nope; my only fix was to stop using OpenLLM and switch to vLLM.

> Should work already, I've gotten it working with:
>
> ```python
> from ctransformers import AutoModelForCausalLM
>
> # Set gpu_layers to the number of layers to offload to...
> ```
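The quoted snippet is cut off above; a minimal sketch of the `gpu_layers` usage it appears to describe, with a placeholder GGUF repo and file name, would look roughly like this:

```python
from ctransformers import AutoModelForCausalLM

# Set gpu_layers to the number of layers to offload to GPU; it defaults to 0 (CPU only).
# The repo and file names below are placeholders, not the ones from the original comment.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-v0.1-GGUF",
    model_file="mistral-7b-v0.1.Q4_K_M.gguf",
    model_type="mistral",
    gpu_layers=50,
)

print(llm("AI is going to"))
```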

The issue can be reproduced with this code from the README:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("/path/to/ggml-model.bin", model_type="gpt2", hf=True)
print(llm("AI is going to"))
```

or in https://colab.research.google.com/drive/1GMhYMUAv_TyZkpfvUI1NirM8-9mCXQyL. Hope this...

I spent all day trying to get Mistral working with ctransformers, but it is returning garbage text on my end. I believe it may be the tokenizer, because ```tokenizer =...
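The tokenizer line is truncated above; what I would expect, going by the ctransformers README's `hf=True` example (the Mistral repo name here is just a placeholder), is roughly:

```python
from ctransformers import AutoModelForCausalLM, AutoTokenizer

# hf=True wraps the GGUF/GGML weights so they can be driven through transformers APIs.
# The repo name is a placeholder.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-v0.1-GGUF",
    model_type="mistral",
    hf=True,
)

# ctransformers builds the tokenizer from the wrapped model itself;
# pairing the model with the wrong tokenizer is one plausible cause of garbage output.
tokenizer = AutoTokenizer.from_pretrained(model)
```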

Are you able to use ```model.generate(...)```? I have gotten everything to run until I start generating text; then it just runs indefinitely.
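For what it's worth, a minimal sketch of capping `model.generate` so it cannot run forever, again assuming the `hf=True` wrapper and a placeholder model repo:

```python
from ctransformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-v0.1-GGUF",  # placeholder repo
    model_type="mistral",
    hf=True,
)
tokenizer = AutoTokenizer.from_pretrained(model)

inputs = tokenizer("AI is going to", return_tensors="pt")

# max_new_tokens puts a hard cap on generation, so the call cannot run indefinitely
# even if the model never emits an end-of-sequence token.
output_ids = model.generate(inputs["input_ids"], max_new_tokens=64)
print(tokenizer.decode(output_ids[0]))
```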