Branko Radovanović
So, #1819 has been merged and it landed in 2.6.2, but when I tried phi-2 (Q5_K_M, specifically) yesterday, it still didn't work. I suppose it doesn't have upstream support yet....
Mixtral 8x7B indeed works in the chat, but it doesn't work with the Python bindings - I guess that's the last bit missing for full support.
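For reference, this is roughly what I'm trying - a minimal sketch using the `gpt4all` Python package; the GGUF filename is just an example, substitute whatever Mixtral quant you actually have in your model folder:

```python
from gpt4all import GPT4All

# Example filename only - point this at the Mixtral GGUF you downloaded via the UI.
model = GPT4All("mixtral-8x7b-instruct-v0.1.Q4_0.gguf", allow_download=False)

# The same model responds fine in the chat UI, but not through the bindings.
with model.chat_session():
    print(model.generate("Why is the sky blue?", max_tokens=256))
```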
This might be related to an issue with the stop token, described in #2239.
Had the same problem on a couple of occasions while running [Coxcomb](https://huggingface.co/N8Programs/Coxcomb-GGUF). There are certainly more models that exhibit the same behavior, but whether that has something to do with...
Deleting multiple chats is annoying: if you have several chats, each using a different model, then selecting a chat (which you need to do in order to delete...
> #1660
>
> that is old stuff ^^

Indeed it is; this is resolved in 2.6.2, and switching between chats is **much** faster now. This does not necessarily...
I'm quite pleased with the way right-click works in 2.7.4 (select all / copy). I believe this issue can be closed now.
I was adding a model today, but the download got stuck, and the download GUI apparently reset to its initial state. Having no other option, I clicked on "Download"...
Happened to me too today. Here is what I think is going on: the download did complete (after half a dozen or so restarts); the app then calculates the MD5...
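To be clear about what I mean by the MD5 step, here's a rough sketch of the kind of check I assume the app performs once the download finishes (file name and expected hash are made up for illustration):

```python
import hashlib

def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute a file's MD5 in chunks, the way a downloader would verify a large GGUF."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Placeholder path and hash, purely illustrative.
expected = "0123456789abcdef0123456789abcdef"
print(md5_of("some-model.Q4_0.gguf") == expected)
```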
> The stop token can be a list (see [llama 3](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/blob/c4a54320a52ed5f88b7a2f84496903ea4ff07b45/generation_config.json#L3)), so ideally everyone would use that correctly when uploading HF models, and the llama.cpp conversion scripts would pick that...
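To illustrate the point about the stop token being a list rather than a single ID, here is a minimal sketch, assuming the standard Hugging Face `generation_config.json` layout, of normalizing `eos_token_id` whether it holds one ID or several:

```python
import json

# Assumes a local copy of the model's generation_config.json.
with open("generation_config.json") as f:
    cfg = json.load(f)

eos = cfg.get("eos_token_id")
# The field may be a single int or a list of ints (as in Llama 3 Instruct),
# so normalize it to a list before wiring it into stop-token handling.
stop_token_ids = eos if isinstance(eos, list) else [eos]
print(stop_token_ids)
```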