ctransformers issues

REST API?

12

What's the best practice to interface ctransformers API and expose it as REST API? I looked at https://github.com/jquesnelle/transformers-openai-api and tried to change it to use ctransformer, but then I stopped...

Spiritdude

WizardCoder-Python-34b GGUF

This may not be a problem with ctransformers per se. If I have made an error posting here, please feel free to close this post haste and I apologize for...

MichaelMartinez

Can i run on windows?

5

I tried to run on windows and i get this error ``` ERROR: Traceback (most recent call last): File "C:\streamlit-llama\venv\lib\site-packages\starlette\routing.py", line 677, in lifespan async with self.lifespan_context(app) as maybe_state: File...

talhaanwarch

Metal support for Replit Code model?

Hello, is there a plan to add Metal support for Replit code model? Thanks!

m1chae1bx

Hello, how to implement the chat

I using chat prompt send to llm(query), but the generate result can not stop. Am using codellama, does there any chat example to reference?

lucasjinreal

Peft models support

I've trained Dolly-v2 model using peft(qlora) and i got new file. it gets too much time to marge it. and some times it's not working. can you add peft models...

zaanind

Multiple GPU support

I have 3 GPTQ models to consume and I have 4 GPUs available. How can I mention which model to load in which GPU? If I do not mention it...

physaikat

QuIP support

Quantization with Incoherence Processing (QuIP) [code](https://github.com/jerry-chee/QuIP) has been released along with the paper. The LDLQ quantization algorithm described in the paper has been implemented, and it is built on top...

loretoparisi

May not even be a Transformers issue.. WizardLM-Uncensored-Falcon-40

1

Just could use some feedback on debugging with ctransformers, have a strange case where things are generally working, but occasionally I don't get output... using /models/WizardLM-Uncensored-Falcon-40b/ggml-model-falcon-40b-wizardlm-qt_k5.bin (GGML) ``` tokens =...

linuxmagic-mp

falcon.cpp: tensor 'lm_head.weight' is missing from model

2

Whenever I try to load a QLoRA-merged Falcon 40B model, the error below happens. `error loading model: falcon.cpp: tensor 'lm_head.weight' is missing from model` The hacky way I did to...

lppllppl920

ctransformers
ctransformers copied to clipboard

Metadata

REST API?

WizardCoder-Python-34b GGUF

Can i run on windows?

Metal support for Replit Code model?

Hello, how to implement the chat

Peft models support

Multiple GPU support

QuIP support

May not even be a Transformers issue.. WizardLM-Uncensored-Falcon-40

falcon.cpp: tensor 'lm_head.weight' is missing from model

← Metadata

Owner

Metadata

ctransformers ctransformers copied to clipboard

Metadata

← Metadata

Owner

Metadata

ctransformers
ctransformers copied to clipboard