ctransformers

Python bindings for Transformer models implemented in C/C++ using the GGML library.

106 ctransformers issues

This change allows users on CUDA 11.8 to use ctransformers with GPU support. #90 #170 #139
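For reference, once a CUDA-enabled build is installed, GPU offload is driven from Python via the documented `gpu_layers` parameter. A minimal sketch, with a hypothetical model choice:

```python
from ctransformers import AutoModelForCausalLM

# Offload 50 transformer layers to the GPU; requires a CUDA-enabled build.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GGML",  # hypothetical model choice
    gpu_layers=50,
)
print(llm("AI is going to"))
```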

Hello, from what I understand of the CMakeLists.txt, the current version is compatible with CUDA 12.0 and greater. Would it be safe to compile using 11.8? Thanks!

Hello, I installed the ctransformers library using `pip install ctransformers --no-binary ctransformers` due to a glibc version error on Linux: `version GLIBC_2.29 not found`. Now I have to use a lib file...
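For context, `from_pretrained` accepts a `lib` parameter that takes either a path to a shared library or one of `avx2`, `avx`, `basic`. A minimal sketch, with a hypothetical library path:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GGML",          # hypothetical model choice
    lib="/path/to/libctransformers.so",  # hypothetical path to the locally built library
)
```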

I'm trying to load TheBloke/Llama-2-7b-Chat-GPTQ from the local directory with the sample code provided here:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("./my_folder/TheBloke/Llama-2-7B-GPTQ", model_type="gptq")
```

but it seems like `.from_pretrained`...

A new set of 7B & 14B models has come up, claiming that the 7B model is better than all of the existing 33B models and the...

I'd love to use ctransformers with the [outlines](https://github.com/normal-computing/outlines) library for constrained generation. I opened this [issue](https://github.com/normal-computing/outlines/issues/225) about it on their repo. In order to hack away at this integration,...
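One plausible starting point for such a hack, not confirmed by the library docs: load the model with `hf=True` so it exposes the transformers `generate()` API, then constrain sampling with a standard `LogitsProcessor`. A sketch under those assumptions (model choice and constraint are hypothetical):

```python
import torch
from ctransformers import AutoModelForCausalLM, AutoTokenizer
from transformers import LogitsProcessor, LogitsProcessorList

class AllowOnly(LogitsProcessor):
    """Masks every token id not in `allowed` (hypothetical toy constraint)."""
    def __init__(self, allowed):
        self.allowed = allowed

    def __call__(self, input_ids, scores):
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed] = 0.0  # leave allowed token scores unchanged
        return scores + mask

model = AutoModelForCausalLM.from_pretrained("TheBloke/LLaMa-7B-GGML", hf=True)
tokenizer = AutoTokenizer.from_pretrained(model)

ids = torch.tensor([tokenizer.encode("Pick a digit:")])
out = model.generate(
    ids,
    max_new_tokens=4,
    logits_processor=LogitsProcessorList([AllowOnly(tokenizer.encode("0123456789"))]),
)
print(tokenizer.decode(out[0]))
```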

Hello, for llama, when decoding Chinese or Japanese characters, one character might need 2 or more tokens to decode, so when streaming, each chunk returns a single token's decode result...
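A common workaround is to stream token ids and only emit text once it no longer ends in the U+FFFD replacement character, i.e. once the multi-byte sequence is complete. A minimal sketch using the library's `tokenize`/`generate`/`detokenize` methods, with a hypothetical model choice:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GGML")  # hypothetical model choice

pending = []
for token in llm.generate(llm.tokenize("你好，世界")):
    pending.append(token)
    text = llm.detokenize(pending)
    # A character split across tokens decodes to U+FFFD; hold the
    # buffered tokens back until the sequence decodes cleanly.
    if not text.endswith("\ufffd"):
        print(text, end="", flush=True)
        pending.clear()
```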

This is the code I run:

```python
from ctransformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("TheBloke/LLaMa-7B-GGML", hf=True)
tokenizer = AutoTokenizer.from_pretrained(model)
tokens = tokenizer.encode("Hello world! What's up?")
output = model(tokens[None,:])
```

I...
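One likely culprit: `tokenizer.encode()` returns a plain Python list, so `tokens[None,:]` fails; the `hf=True` wrapper is typically driven through the transformers generation utilities with torch tensors. A sketch of that usage, assuming the same model as above:

```python
import torch
from ctransformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("TheBloke/LLaMa-7B-GGML", hf=True)
tokenizer = AutoTokenizer.from_pretrained(model)

# encode() returns a plain list; wrap it in a tensor before batching.
input_ids = torch.tensor([tokenizer.encode("Hello world! What's up?")])
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0]))
```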

I am trying to use a 'custom tokenizer' but I am unable to see how I can invoke it. Also, can we use a standard tokenizer from HF by pulling...
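As a sketch of one possible approach, assuming the GGML model and the Hugging Face tokenizer share the same vocabulary (both names here are hypothetical choices):

```python
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer

llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GGML")  # hypothetical model choice
hf_tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")     # hypothetical tokenizer choice

# Encode with the HF tokenizer, generate token ids with ctransformers,
# then decode the produced ids with the same HF tokenizer.
ids = hf_tokenizer.encode("Hello world!")
out = list(llm.generate(ids))
print(hf_tokenizer.decode(out))
```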

For the llama model's `.eval()`, I am getting `llm.logits` values that are greater than 0, which is wrong since they are supposed to be logprobs:

```
00000 = {float} 1.6911218166351318
00001 =...
```
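For what it's worth, raw logits are unnormalized scores and can legitimately be positive; log-probabilities are obtained by applying a log-softmax. A minimal sketch, assuming `llm.logits` is a flat list of per-vocab-entry floats after an `eval()` call:

```python
import math

logits = llm.logits  # per-vocab-entry scores from the last eval() call

# Log-softmax: subtract the log of the partition function. The max-shift
# keeps the exponentials numerically stable.
m = max(logits)
log_z = m + math.log(sum(math.exp(x - m) for x in logits))
logprobs = [x - log_z for x in logits]

assert all(lp <= 0.0 for lp in logprobs)  # log-probabilities are never positive
```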