CTranslate2
Fast inference engine for Transformer models
ChatGLM is a popular ChatGPT-like model for Chinese: https://github.com/THUDM/ChatGLM-6B Could ct2 support ChatGLM and speed up its inference? Thanks a lot.
On my computer (MacBook Pro 14", 2021 with M1 Pro, 16GB RAM, Ventura 13.3.1), this code causes `python` to segfault: ```python import ctranslate2 import numpy as np ``` The fix...
Comments and feedback welcome. Figured this would help users if they wanted to get started.
Hello, I noticed that when supplying a target_prefix to the translate_batch or generate_tokens method, the latency for generating the supplied tokens is equivalent to the situation where they are not...
I ran the new Llama3 sample script and it seems to be conversing with itself so I think there's a problem with how the prompt is being constructed...See below: ```...
Hello, thank you for a great project! I am getting this error when using ALiBi or RoPE positional encoding in a Transformer NMT model from OpenNMT-py: KeyError: 'encoder.embeddings.make_embedding.pe.pe' Absolute and...
Hello peeps, it's me again. The new converter works great with Phi3 but doesn't work with the 128k version located here: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct After much chagrin, I had a scintillating conversation...
I am using fairseq transform align and I'm trying to get the alignment results, with no success. Is this possible somehow?
Hi! This has been an awesome project and has helped me a lot. I was wondering whether we can get dynamic LoRA switching in ctranslate2. Afaik TensorRT-LLM has it.