BBC-Esq

Results 104 comments of BBC-Esq

I benched https://huggingface.co/cognitivecomputations/dolphin-llama2-7b:

With Flash Attention:

| Model | Beam Size | Tokens per Second | VRAM Usage (MB) |
|----------------------------|-----------|-------------------|-----------------|
| dolphin-llama2-7b-ct2-int8 | 1 | 39.43 | 10040.58 |

...
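In case it's useful, here's a rough sketch of how a tokens-per-second figure like the one above can be computed. `generate_fn` is a stand-in for whatever actually produces the tokens (e.g. a ctranslate2 generation call); it's an assumption for illustration, not the library's API.

```python
import time

def tokens_per_second(generate_fn, prompt_tokens):
    """Time one generation call and report throughput.

    `generate_fn` is a hypothetical callable that takes prompt tokens and
    returns the generated tokens; swap in the real generation call.
    """
    start = time.perf_counter()
    output_tokens = generate_fn(prompt_tokens)
    elapsed = time.perf_counter() - start
    # throughput = generated tokens divided by wall-clock seconds
    return len(output_tokens) / elapsed
```

For a fairer number you'd average over several runs after a warm-up call, since the first generation usually pays one-time initialization costs.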

More benchmarks...wanted to see if flash attention was better utilized when running in bfloat16, the model's original format, but it still doesn't benefit as much as mistral/solar/llama3... ## With Flash...

Did some additional legwork on this "su" scaling and here's what I came up with...hope it helps, and hope that implementing it still allows someone to use the new flash...

Closing since this was successfully implemented in release 4.3.

Here is yet another badass model @minhthuc2502 . Would love to help create a converter but am not an expert. It's the 1.6b version of Zephyr: https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b It kicks ass...

To maybe save you a few minutes...I've gathered the following information for whoever picks this up:
1) The `config.json` states that the architecture is "StableLmForCausalLM"
2) I think this is the relevant documentation: https://huggingface.co/docs/transformers/v4.40.0/en/model_doc/stablelm
3)...

Wish I could help but it's all in Chinese...what exactly are you trying to do?

Sorry, thought I might help but not familiar with that model.

Was this resolved? As I sit drinking my morning coffee reading about one of my favorite libraries, ctranslate2, I don't want to waste time reading about issues that have been...

I get the same error message but it doesn't seem to affect the quality or speed of the transcriptions. It might have something to do with timestamps, though...Thoughts? https://superuser.com/questions/1226305/ffmpeg-warning-timestamps-are-unset-in-a-packet-when-converting-h264-to-mp4 https://stackoverflow.com/questions/48927762/ffmpeg-timestamps-are-unset-in-a-packet-for-stream-0-non-monotonous-dts-in-outp
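For what it's worth, the fix suggested in those threads is to regenerate the missing presentation timestamps while copying the streams; a command-line sketch (file names are placeholders):

```shell
# -fflags +genpts regenerates missing PTS on the input;
# -c copy remuxes without re-encoding, so quality is untouched.
ffmpeg -fflags +genpts -i input.mp4 -c copy output.mp4
```

Since the warning doesn't seem to hurt transcription quality, this is probably only worth doing if the reported timestamps actually drift.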