AlpinDale comments

Results: 170 comments of AlpinDale

[Bug]: Generation sometimes slows to a crawl for all requests when there is a DRY sampler request

#868 partially solved this issue. DRY is a lot faster now, but not as fast as other samplers. I think we can close this issue once a new release is...

[Bug]: Banned EOS_TOKEN still stopping generation

That's a bit odd. Which version are you using? FYI, custom token bans were disabled between 0.6.0 and 0.6.1.post1, and were enabled only in 0.6.2 by #751. If you're on the...

[Feature]: xtc sampling support for kai api

I'll enable it very soon, thanks for reminding!

[Feature]: xtc sampling support for kai api

@BlairSadewitz I was looking into adding this, but I found out it's already enabled. Can you check again?

chore: update klite.embd

This file is copied from [koboldcpp](https://github.com/LostRuins/koboldcpp/blob/concedo/klite.embd). I recommend sending a patch there. We'll eventually pull the fix once we update the klite embed file.

feat: add shrek sampler (entropy)

> How is this effort going? Would love to experiment with this on my AI server w/ Tensor Parallelism enabled.

I've been busy with #769 so not much time to...

[Feature]: pass-through parameter from request to model.forward (already implemented)

Doesn't seem particularly difficult to maintain, so go ahead!

[Usage]: Distributed Inference Without Docker.

Yes, you can; it's just much more difficult without Docker. Are the machines physically connected to each other? If they're not, I'd recommend not doing it - it would be...

[Bug]: .\gguf_to_torch.py broken along with direct load GGUF

That script is outdated; I forgot to remove it. We've been doing implicit conversion of GGUF models for a very long time.

[Bug]: strange repetition issue

I've been trying to replicate this, with some success. `DolphinPod/dolphin-2.9.1-llama3.1-8b`:

```json
{
  "id": "chat-93aaefa981c046d497a5699308ad094d",
  "object": "chat.completion",
  "created": 1729716396,
  "model": "DolphinPod/dolphin-2.9.1-llama3.1-8b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content":...
```