BBC-Esq

Results 104 comments of BBC-Esq

I benched https://huggingface.co/cognitivecomputations/dolphin-llama2-7b:

With Flash Attention:

| Model | Beam Size | Tokens per Second | VRAM Usage (MB) |
|----------------------------|-----------|-------------------|-----------------|
| dolphin-llama2-7b-ct2-int8 | 1 | 39.43 | 10040.58 |

...
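In case it's useful, here's a rough sketch of how a tokens-per-second figure like the one above can be computed. `generate_fn` is a stand-in for whatever actually produces the tokens (e.g. a ctranslate2 generation call); it's an assumption for illustration, not the library's API.

```python
import time

def tokens_per_second(generate_fn, prompt_tokens):
    """Time one generation call and report throughput.

    `generate_fn` is a hypothetical callable that takes prompt tokens and
    returns the generated tokens; swap in the real generation call.
    """
    start = time.perf_counter()
    output_tokens = generate_fn(prompt_tokens)
    elapsed = time.perf_counter() - start
    # throughput = generated tokens divided by wall-clock seconds
    return len(output_tokens) / elapsed
```

For a fairer number you'd average over several runs after a warm-up call, since the first generation usually pays one-time initialization costs.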

More benchmarks...wanted to see if flash attention was better utilized when running in bfloat16, the model's original format, but it still doesn't benefit as much as mistral/solar/llama3... ## With Flash...

Did some additional legwork on this "su" scaling and here's what I came up with...hope it helps, and hope that implementing it still allows someone to use the new flash...

Closing since this was successfully implemented in release 4.3.

Here is yet another badass model @minhthuc2502 . Would love to help create a converter but am not an expert. It's the 1.6b version of Zephyr: https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b It kicks ass...

To maybe save you a few minutes...I've gathered the following information for whoever picks this up:
1) The `config.json` states that the architecture is "StableLmForCausalLM"
2) I think this is the relevant documentation: https://huggingface.co/docs/transformers/v4.40.0/en/model_doc/stablelm
3)...

Wish I could help but it's all in Chinese...what exactly are you trying to do?

Sorry, thought I might help but not familiar with that model.

Was this resolved? As I sit drinking my morning coffee reading about one of my favorite libraries, ctranslate2, I don't want to waste time reading about issues that have been...

I get the same error message but it doesn't seem to affect the quality or speed of the transcriptions. It might have something to do with timestamps, though...Thoughts? https://superuser.com/questions/1226305/ffmpeg-warning-timestamps-are-unset-in-a-packet-when-converting-h264-to-mp4 https://stackoverflow.com/questions/48927762/ffmpeg-timestamps-are-unset-in-a-packet-for-stream-0-non-monotonous-dts-in-outp
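For what it's worth, the fix suggested in those threads is to regenerate the missing presentation timestamps while copying the streams; a command-line sketch (file names are placeholders):

```shell
# -fflags +genpts regenerates missing PTS on the input;
# -c copy remuxes without re-encoding, so quality is untouched.
ffmpeg -fflags +genpts -i input.mp4 -c copy output.mp4
```

Since the warning doesn't seem to hurt transcription quality, this is probably only worth doing if the reported timestamps actually drift.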