Krystof Olik

Results 9 comments of Krystof Olik

Is it already possible to add a predefined custom system message for a model?

I get infinite loading when using gradio-grcalendar

That is one monster of a system prompt. Much larger than ChatGPT's system prompt.

Is there any way to use tensor parallelism with an uneven number of GPUs?

Don't use the CLI; use transformers from Python instead.

Unit tests should also be updated to support this change. But IMO it's a very useful feature. At least for me, since LLMs always like to answer with: """ Here...

The postfix doesn't work and breaks the actual program. Upon further evaluation, I think postfix pruning is kind of impossible during LLM generation. The prefix is easily fixed, though.
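
To illustrate why the prefix side is the easy one, here is a minimal sketch of stripping a known prefix from streamed output. It is not the code from the change under discussion; the function name strip_prefix_stream and the idea of receiving the output as an iterable of text chunks are assumptions made for the example. The prefix can be decided as soon as enough text has arrived, whereas a postfix can only be recognised once generation has finished, which is presumably why it is hard to prune mid-stream.

```python
from typing import Iterable, Iterator


def strip_prefix_stream(chunks: Iterable[str], prefix: str) -> Iterator[str]:
    """Yield streamed text with a leading `prefix` removed, if present.

    Hypothetical helper: `chunks` stands in for whatever the generation
    loop produces piece by piece.
    """
    buffer = ""
    decided = False  # becomes True once we know whether the prefix is there
    for chunk in chunks:
        if decided:
            # Past the start of the stream: pass text through untouched.
            yield chunk
            continue
        buffer += chunk
        if len(buffer) >= len(prefix):
            # Enough text has arrived to compare against the full prefix.
            if buffer.startswith(prefix):
                buffer = buffer[len(prefix):]
            decided = True
            if buffer:
                yield buffer
            buffer = ""
        elif not prefix.startswith(buffer):
            # The text already diverges from the prefix, so stop holding it back.
            decided = True
            yield buffer
            buffer = ""
    if not decided and buffer:
        # Stream ended before the prefix could be matched in full.
        yield buffer
```

Joining the yielded pieces gives the full output minus the prefix, and output that never starts with the prefix passes through unchanged.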

I have the same problem. I even tried modifying the transformers Whisper source code to set output_attentions=False, and that still didn't work.