
Fast inference engine for Transformer models

Results 173 CTranslate2 issues

Convert the model (currently supports OpenNMT-py) and keep it in memory => inference without saving the model to disk. Customizing the wrapper to reduce memory usage. TODO: implementation for other converters

Is MPS support on the roadmap? I wanted to use faster-whisper on my Mac, but it only uses the CPU.

enhancement

Excellent work! Do you support the Qwen-1_8B-Chat model? Looking forward to your reply.

Hi, is there any way to extract the last hidden state (before the lm_head dense layer) of the T5 or GPT models? There are some kinds of models that need to take...
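
The "last hidden state" being asked for is the tensor right before the final lm_head projection. CTranslate2 does not currently expose it; the NumPy sketch below only illustrates the relationship (all names here are illustrative, not part of the CTranslate2 API):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab_size = 8, 32

# Hypothetical decoder output for a batch of 1, sequence length 5.
hidden = rng.standard_normal((1, 5, d_model)).astype(np.float32)

# The lm_head is just a final dense projection of that hidden state.
w_lm_head = rng.standard_normal((vocab_size, d_model)).astype(np.float32)
logits = hidden @ w_lm_head.T  # shape (1, 5, vocab_size)

# "Extracting the last hidden state" means returning `hidden`
# before this matmul instead of (or in addition to) `logits`.
assert logits.shape == (1, 5, vocab_size)
```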

enhancement

I've been examining the encoded output of whisper and I see that the results are different when the same input is sent in via batch or one-by-one. I made a...
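
Small numerical differences between batched and one-by-one inference are generally expected: batching changes padding and the accumulation order inside the GEMM kernels, and floating-point addition is not associative. A minimal NumPy illustration of the underlying effect (not a CTranslate2 reproduction):

```python
import numpy as np

# Summing the same float32 values in a different order changes the result.
xs = np.array([1e8, 1.0, -1e8], dtype=np.float32)

# Left-to-right: (1e8 + 1) rounds back to 1e8 in float32, then - 1e8 -> 0.0
left_to_right = np.float32(0.0)
for x in xs:
    left_to_right = np.float32(left_to_right + x)

# Reordered: (1e8 - 1e8) + 1 -> 1.0
reordered = np.float32(np.float32(xs[0] + xs[2]) + xs[1])

print(left_to_right, reordered)  # 0.0 vs 1.0
```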

Operating system: Ubuntu 22.04.2, Python 3.10.6, CTranslate2 3.16. When exporting the bigscience/bloomz model using `ct2-transformers-converter --force --model bigscience/bloomz --output_dir bloomz --quantization float16`, the conversion fails. The conversion process works well for other bloom...

bug

This PR adds quantized `Conv1D` inference on top of #1597. With the previous `int8` quantization implementation, quantized inference couldn't bring any speedup because quantization itself was the bottleneck. To alleviate that,...
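
For context, symmetric per-tensor `int8` quantization maps a float weight tensor to 8-bit integers plus one scale. The sketch below is only an illustration of that scheme in NumPy, not the PR's actual C++ kernels:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = np.float32(np.abs(w).max() / 127.0)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.float32) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
# Hypothetical Conv1D weight: (out_channels, kernel_width, in_channels).
w = rng.standard_normal((64, 3, 16)).astype(np.float32)

q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
# Reconstruction error is bounded by half a quantization step.
assert err <= scale / 2 + 1e-6
```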

1. initial_prompt: When I convert the official Whisper model to CTranslate2 format, I can use "initial_prompt" normally. When I convert my fine-tuned Whisper model to CTranslate2 format and use "initial_prompt", I get...

This PR adds armv7 support:
* Implement slower generic versions of `div`, `mul_add`, `reduce_add`, and `reduce_max`.
* Rename `CT2_ARM64_BUILD` to `CT2_ARM_BUILD`.

Tested this on a Galaxy S21 with the library built for `armv7`,...
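
The four primitives named above have simple scalar semantics; the PR implements them in C++ inside CTranslate2, but a pure-Python reference of what each one computes looks like this:

```python
def div(a, b):
    """Elementwise a / b."""
    return [x / y for x, y in zip(a, b)]

def mul_add(a, b, c):
    """Elementwise fused multiply-add: a * b + c."""
    return [x * y + z for x, y, z in zip(a, b, c)]

def reduce_add(a):
    """Sum of all elements."""
    total = 0.0
    for x in a:
        total += x
    return total

def reduce_max(a):
    """Maximum element."""
    m = a[0]
    for x in a[1:]:
        if x > m:
            m = x
    return m

print(mul_add([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]))  # [8.0, 14.0]
```

The "slower generic" versions in the PR are exactly such element-by-element loops, used when NEON intrinsics for the target ISA are unavailable.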

Hi, CTranslate2 uses oneDNN. Recent oneDNN versions have [support for AMD GPUs](https://github.com/oneapi-src/oneDNN/tree/master/src/gpu/amd). This [requires Intel oneAPI DPC++](https://developer.codeplay.com/products/oneapi/amd/2023.0.0/guides/get-started-guide-amd). The same approach could [potentially enable NVIDIA GPU](https://developer.codeplay.com/products/oneapi/nvidia/2023.0.0/guides/get-started-guide-nvidia) support too. It would help...

enhancement
help wanted