kwrobel.eth comments

Results: 59 comments of kwrobel.eth

Speed up inference problems

I have found the exact place: https://github.com/EleutherAI/lm-evaluation-harness/blob/e9d429e105fa95dd4a1b5606b306289d207fcf62/lm_eval/models/huggingface.py#L1049 and replicated it with minimal code (I get the same numbers at this line). A model loaded on CPU with bfloat16 gives the same numbers: ```...

Speed up inference problems

Why do you think it is a problem with the model implementation? But yes, it is not related to the lm-evaluation-harness repository. Maybe it is some GPU optimization (cuBLAS?).

Speed up inference problems

The same issue occurs with `meta-llama/Llama-2-7b-chat-hf`. Maybe it is resolved in a newer cuBLAS: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cublas-release-12-3-update-1 ? I am using CUDA 12.1, cuBLAS 12.1.3.1

Does it support async?

Thank you! It would be helpful to support it.

feat: Hugging Face Pipeline as a new Chat Model

What do you mean? It is still not implemented.

Be able to generate random samples in `gr.Examples` or `examples` in `gr.ChatInterface`

@abidlabs Thanks. But it doesn't work correctly. After clicking an example button, a different text is provided than the one shown on the button. You can check here: https://huggingface.co/spaces/speakleash/Bielik-7B-Instruct-v0.1 ![image](https://github.com/user-attachments/assets/b3808edd-2102-48b0-9f8e-1116dc74436c)

mermaid does not work in slideshow

It would be very helpful to support mermaid.

How to properly run model training on 1 RTX4090 graphics card?

speakleash/Bielik-7B-Instruct-v0.1 supports a system prompt, so the problem must be with the data: "Conversation roles must alternate user/assistant/user/assistant/..." You can't apply the chat template to `conversation=[{"role": "assistant", "content": r"%%%%%%%%%%%%%%%%"}],` because before assistant role...
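
The alternation rule quoted in that error can be checked on the training data before the tokenizer is called. Below is a minimal sketch, not the template's actual code: `check_roles` is a hypothetical helper, and it assumes that an optional leading `system` message is allowed and that the remaining messages must start with `user` and then alternate with `assistant`.

```python
def check_roles(conversation):
    """Return None if the roles alternate user/assistant, else a short error message.

    Hypothetical helper for illustration; it only mirrors the rule quoted in the
    error message and is not taken from any library.
    """
    messages = list(conversation)

    # Assumption: a single leading system message is permitted and skipped.
    if messages and messages[0]["role"] == "system":
        messages = messages[1:]

    for index, message in enumerate(messages):
        # Even positions must be "user", odd positions must be "assistant".
        expected = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected:
            return (
                f"message {index}: expected role '{expected}', "
                f"got '{message['role']}'"
            )
    return None


# A conversation that starts with an assistant turn is rejected.
bad = [{"role": "assistant", "content": "%%%%%%%%%%%%%%%%"}]

# A system prompt followed by alternating user/assistant turns is accepted.
good = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi, how can I help?"},
]

print(check_roles(bad))
print(check_roles(good))
```

Running such a check over the dataset should point to the samples that would trigger the error, so they can be fixed or dropped before training.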

feat: [sc-24365] ens-normalize add "ignored" characters to disallowed sequence

@Carbon225 Do you think this is ready?

Fix partial caching of openai models

I don't understand what it is about. However, now caching is not working with `openai-completions`.