ctransformers

Python bindings for Transformer models implemented in C/C++ using the GGML library.

106 ctransformers issues

This change allows users on CUDA 11.8 to use ctransformers with GPU support. #90 #170 #139
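For reference, once a CUDA-enabled build is installed, GPU offload is driven from Python via the documented `gpu_layers` parameter. A minimal sketch, with a hypothetical model choice:

```python
from ctransformers import AutoModelForCausalLM

# Offload 50 transformer layers to the GPU; requires a CUDA-enabled build.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GGML",  # hypothetical model choice
    gpu_layers=50,
)
print(llm("AI is going to"))
```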

Hello, from what I understand of the CMakeLists.txt, the current version is compatible with CUDA 12.0 and greater. Would it be safe to compile using 11.8? Thanks!

Hello, I installed the ctransformers library using `pip install ctransformers --no-binary ctransformers` due to a glibc version error on Linux: `version GLIBC_2.29 not found`. Now I have to use a lib file...
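For context, `from_pretrained` accepts a `lib` parameter that takes either a path to a shared library or one of `avx2`, `avx`, `basic`. A minimal sketch, with a hypothetical library path:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GGML",          # hypothetical model choice
    lib="/path/to/libctransformers.so",  # hypothetical path to the locally built library
)
```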

I'm trying to load TheBloke/Llama-2-7b-Chat-GPTQ from the local directory with the sample code provided here:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("./my_folder/TheBloke/Llama-2-7B-GPTQ", model_type="gptq")
```

but it seems like `.from_pretrained`...

A new set of 7B & 14B models has come up, claiming that the 7B model is better than all of the existing 33B models and the...

I'd love to use ctransformers with the [outlines](https://github.com/normal-computing/outlines) library for constrained generation. I opened this [issue](https://github.com/normal-computing/outlines/issues/225) about it on their repo. In order to hack away at this integration,...
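One plausible starting point for such a hack, not confirmed by the library docs: load the model with `hf=True` so it exposes the transformers `generate()` API, then constrain sampling with a standard `LogitsProcessor`. A sketch under those assumptions (model choice and constraint are hypothetical):

```python
import torch
from ctransformers import AutoModelForCausalLM, AutoTokenizer
from transformers import LogitsProcessor, LogitsProcessorList

class AllowOnly(LogitsProcessor):
    """Masks every token id not in `allowed` (hypothetical toy constraint)."""
    def __init__(self, allowed):
        self.allowed = allowed

    def __call__(self, input_ids, scores):
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed] = 0.0  # leave allowed token scores unchanged
        return scores + mask

model = AutoModelForCausalLM.from_pretrained("TheBloke/LLaMa-7B-GGML", hf=True)
tokenizer = AutoTokenizer.from_pretrained(model)

ids = torch.tensor([tokenizer.encode("Pick a digit:")])
out = model.generate(
    ids,
    max_new_tokens=4,
    logits_processor=LogitsProcessorList([AllowOnly(tokenizer.encode("0123456789"))]),
)
print(tokenizer.decode(out[0]))
```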

Hello, for llama, when decoding Chinese or Japanese characters, one character might need 2 or more tokens to decode, so when streaming, each chunk returns a single token's decode result...
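A common workaround is to stream token ids and only emit text once it no longer ends in the U+FFFD replacement character, i.e. once the multi-byte sequence is complete. A minimal sketch using the library's `tokenize`/`generate`/`detokenize` methods, with a hypothetical model choice:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GGML")  # hypothetical model choice

pending = []
for token in llm.generate(llm.tokenize("你好，世界")):
    pending.append(token)
    text = llm.detokenize(pending)
    # A character split across tokens decodes to U+FFFD; hold the
    # buffered tokens back until the sequence decodes cleanly.
    if not text.endswith("\ufffd"):
        print(text, end="", flush=True)
        pending.clear()
```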

This is the code I run:

```python
from ctransformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("TheBloke/LLaMa-7B-GGML", hf=True)
tokenizer = AutoTokenizer.from_pretrained(model)
tokens = tokenizer.encode("Hello world! What's up?")
output = model(tokens[None,:])
```

I...
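One likely culprit: `tokenizer.encode()` returns a plain Python list, so `tokens[None,:]` fails; the `hf=True` wrapper is typically driven through the transformers generation utilities with torch tensors. A sketch of that usage, assuming the same model as above:

```python
import torch
from ctransformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("TheBloke/LLaMa-7B-GGML", hf=True)
tokenizer = AutoTokenizer.from_pretrained(model)

# encode() returns a plain list; wrap it in a tensor before batching.
input_ids = torch.tensor([tokenizer.encode("Hello world! What's up?")])
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0]))
```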

I am trying to use a 'custom tokenizer' but I am unable to see how I can invoke it. Also, can we use a standard tokenizer from HF by pulling...
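As a sketch of one possible approach, assuming the GGML model and the Hugging Face tokenizer share the same vocabulary (both names here are hypothetical choices):

```python
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer

llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GGML")  # hypothetical model choice
hf_tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")     # hypothetical tokenizer choice

# Encode with the HF tokenizer, generate token ids with ctransformers,
# then decode the produced ids with the same HF tokenizer.
ids = hf_tokenizer.encode("Hello world!")
out = list(llm.generate(ids))
print(hf_tokenizer.decode(out))
```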

For the llama model's `.eval()`, I am getting `llm.logits` values that are greater than 0, which is wrong since they are supposed to be logprobs:

```
00000 = {float} 1.6911218166351318
00001 =...
```
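For what it's worth, raw logits are unnormalized scores and can legitimately be positive; log-probabilities are obtained by applying a log-softmax. A minimal sketch, assuming `llm.logits` is a flat list of per-vocab-entry floats after an `eval()` call:

```python
import math

logits = llm.logits  # per-vocab-entry scores from the last eval() call

# Log-softmax: subtract the log of the partition function. The max-shift
# keeps the exponentials numerically stable.
m = max(logits)
log_z = m + math.log(sum(math.exp(x - m) for x in logits))
logprobs = [x - log_z for x in logits]

assert all(lp <= 0.0 for lp in logprobs)  # log-probabilities are never positive
```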