BBC-Esq
Interested in this as an attorney and extracting numerous legal citations...
Quick question for you, sir. You realize that Flash Attention 2 only works on certain GPUs, correct? If a person tries to run the model without the required GPU, will the...
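The thread never shows the actual check, but the usual pattern is to test the GPU's compute capability before requesting Flash Attention 2 and fall back to PyTorch's SDPA otherwise. A minimal sketch, assuming an NVIDIA card (the helper name `pick_attn_implementation` is mine, not a library API; FA2 kernels target Ampere, compute capability 8.0, and newer):

```python
def pick_attn_implementation(compute_capability):
    """Choose an attn_implementation value to pass to transformers' from_pretrained().

    Flash Attention 2 only ships kernels for NVIDIA Ampere (compute
    capability 8.0) and newer; on older GPUs, fall back to PyTorch SDPA.
    """
    major, _minor = compute_capability
    return "flash_attention_2" if major >= 8 else "sdpa"


# On a real system the tuple would come from torch.cuda.get_device_capability();
# hard-coded here for illustration.
print(pick_attn_implementation((8, 6)))  # Ampere-class card
print(pick_attn_implementation((7, 5)))  # Turing-class card
```

The returned string can then be passed as `attn_implementation=...` when loading the model, so the same script runs on both supported and unsupported GPUs.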
Hey @sanchit-gandhi, here are the updated comparisons. Feel free to let me know how to cast to float16/bfloat16 if you want, and/or use bitsandbytes or whatever this type of model is compatible...
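The reason casting to float16/bfloat16 matters for these benchmarks is simple arithmetic: weights stored in 16-bit types take half the bytes of float32. A rough sketch of the weight-only math (the helper `weight_memory_gib` is illustrative; it ignores activations and KV cache, which also consume VRAM):

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}

def weight_memory_gib(n_params, dtype="float32"):
    """Rough weight-only memory footprint in GiB (ignores activations, KV cache)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3

# e.g. a 1.5B-parameter model: fp16/bf16 halves the weight footprint vs fp32
fp32 = weight_memory_gib(1_500_000_000, "float32")
fp16 = weight_memory_gib(1_500_000_000, "float16")
print(f"{fp32:.2f} GiB vs {fp16:.2f} GiB")
```

In transformers this is typically requested at load time with `torch_dtype=torch.float16` (or `torch.bfloat16`) in `from_pretrained`.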
The WhisperSpeech library uses two types of models, s2a and t2s, and there are multiple checkpoints of each, so this benchmark tests every combination.
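Testing "every combination" of the two model types is just the Cartesian product of the two checkpoint lists. A sketch of how the benchmark grid can be generated (the model names below are placeholders, not WhisperSpeech's actual checkpoint names):

```python
from itertools import product

# Hypothetical checkpoint lists; substitute the real WhisperSpeech names
# when actually benchmarking.
t2s_models = ["t2s-tiny", "t2s-small", "t2s-base"]
s2a_models = ["s2a-tiny", "s2a-small"]

# Every (t2s, s2a) pairing, i.e. the full benchmark grid
combos = list(product(t2s_models, s2a_models))
for t2s, s2a in combos:
    print(f"benchmark {t2s} + {s2a}")
```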
Awesome, thanks dude! Don't know why I didn't realize that, lol. Anyway, here's the updated bench: about the same processing time but about 30% less VRAM used. At a certain...
@ylacombe here's the updated benchmark including the new Large model (congratulations on that, Hugging Face, BTW). A quick disclaimer: I'm giving two charts - one showing VRAM usage and...
OK, let me retry it...thanks.
Strange... it did the same thing again. Below I'm including (1) the full response, (2) the command I used to run the script, and (3) a modified script I created that...
I solved the issue by using the "end_token" parameter. Here's the script for people's benefit:

```python
class Llama38BInstructModel:
    def __init__(self, user_prompt="PLACEHOLDER_FOR_USER_PROMPT", system_prompt="You are a helpful assistant who answers questions in...
```
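The general idea behind an end-token fix is to stop (or trim) generation at the model's end-of-turn marker so the output doesn't run on into junk. A minimal post-processing sketch of that idea (the helper `trim_at_end_token` is mine, not the library's API; `<|eot_id|>` is Llama 3 Instruct's end-of-turn token):

```python
def trim_at_end_token(text, end_token="<|eot_id|>"):
    """Cut generated text at the first end token so trailing output is dropped.

    Illustrative helper only; the real fix in the script above passes the
    token to the generator's "end_token" parameter so decoding stops early.
    """
    idx = text.find(end_token)
    return text if idx == -1 else text[:idx]


print(trim_at_end_token("The answer is 42.<|eot_id|>assistant rambling..."))
```

Passing the token to the generator itself is preferable when supported, since it stops decoding early instead of wasting compute on tokens you then throw away.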
Absolutely! So glad you asked! lol. CTranslate2 actually does support true batching, but at the C++ level. I'll give you my repository that uses it via the amazing `WhisperS2T` as...
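True batching just means the backend processes several inputs per forward pass instead of one at a time. While CTranslate2 handles that internally in C++, the Python side only needs to hand over inputs in groups; a sketch of that grouping step (the helper `make_batches` and the file names are illustrative, not WhisperS2T's API):

```python
def make_batches(items, batch_size):
    """Group inputs into fixed-size batches (the last batch may be short)."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Hypothetical audio files to transcribe in batches of 2
files = [f"clip_{n}.wav" for n in range(5)]
for batch in make_batches(files, batch_size=2):
    print(batch)
```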