Minh-Thuc

Results 86 comments of Minh-Thuc

The method ``generate_tokens`` has an option ``return_log_prob``, which is ``false`` by default. You can enable it and read the log probability of each generated token from ``result.log_prob``.

Hello, I will include the log prob for the whole vocabulary.

You are using ``sampling_topk = 1``. In this case, the ``best sampler`` is used and ``sampling_temperature`` is not applied to randomize the sample (it is the ``random sampler`` that is affected by ``sampling_temperature``)....
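To make the point concrete, here is a plain-Python sketch of top-k sampling with temperature. It is not CTranslate2's implementation, and the function name ``sample_token`` is hypothetical. It only shows why the temperature cannot change the outcome when a single candidate is kept: with ``topk == 1`` the choice is always the highest-scoring token.

```python
import math
import random
from typing import List


def sample_token(logits: List[float], topk: int, temperature: float,
                 rng: random.Random) -> int:
    """Return a token id chosen from the topk highest-scoring logits."""
    if topk == 1:
        # Only one candidate is kept, so the pick is the argmax and the
        # temperature is never consulted.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Keep the topk best candidates, sorted by score.
    candidates = sorted(range(len(logits)), key=lambda i: logits[i],
                        reverse=True)[:topk]
    # Temperature rescales the scores before the softmax: higher values
    # flatten the distribution, lower values sharpen it.
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [0.1, 2.0, 0.5]
# With topk == 1 the result is the same for every temperature.
print([sample_token(logits, 1, t, rng) for t in (0.1, 1.0, 10.0)])
```

To see the temperature take effect in this sketch, ``topk`` has to be larger than 1 so that there is more than one candidate to draw from.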

How do you measure the time taken to get the result for the first 1000 samples? Normally, when 100,000 samples are passed in ``for-loop 1``, the first 1000 samples...

Currently, CTranslate2 does not support ``DirectML``. Supporting it would require a new implementation for this backend.

Actually, we don't have a plan to support this model yet; it will stay in the backlog for the future.

CTranslate2 will soon support Flash Attention 2, following PR #1651. I will make the release as soon as possible. I ran some tests and saw a performance improvement with long prompts....

> This is great! Any chance you could provide some tips as to how to test this on faster-whisper?

Make sure you have an Ampere GPU or newer. You can just...

Hello, I did not run a benchmark with Faster Whisper, but there are some benchmarks for Flash Attention with several LLMs [here](https://github.com/OpenNMT/CTranslate2/issues/1676).

> Thank you for your attempt to help! 😄 I will post this question directly in the `faster-whisper` repo while waiting for @minhthuc2502's response.

With recent tests, I posted...