AlpinDale
#868 partially solved this issue. DRY is a lot faster now, but not as fast as other samplers. I think we can close this issue once a new release is...
That's a bit odd. Which version are you using? FYI, custom token bans were disabled between 0.6.0 and 0.6.1.post1, and enabled only on 0.6.2 from #751. If you're on the...
I'll enable it very soon, thanks for reminding!
@BlairSadewitz I was looking into adding this, but I found out it's already enabled. Can you check again?
This file is copied from [koboldcpp](https://github.com/LostRuins/koboldcpp/blob/concedo/klite.embd). I recommend sending a patch there. We'll eventually pull the fix once we update the klite embed file.
> How is this effort going? Would love to experiment with this on my AI server w/ Tensor Parallelism enabled.

I've been busy with #769 so not much time to...
Doesn't seem particularly difficult to maintain, so go ahead!
Yes, you can; it's just much more difficult without Docker. Are the machines physically connected to each other? If they're not, I'd recommend not doing it - it would be...
That script is outdated; I forgot to remove it. We've been doing implicit conversion of GGUF models for a very long time.
I've been trying to replicate this, with some success: `DolphinPod/dolphin-2.9.1-llama3.1-8b`: ```json { "id": "chat-93aaefa981c046d497a5699308ad094d", "object": "chat.completion", "created": 1729716396, "model": "DolphinPod/dolphin-2.9.1-llama3.1-8b", "choices": [ { "index": 0, "message": { "role": "assistant", "content":...