AlpinDale

170 comments

This might end up being an issue that I'll need to discuss upstream with the vLLM team. The mixtral_quant modeling code there doesn't use the FusedMoE implementation, and does expert...
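For context, here is a generic PyTorch sketch of the unfused pattern being described, where tokens are routed through a Python-level loop over separate expert modules. This is purely illustrative and not vLLM's actual `mixtral_quant` code; a FusedMoE implementation would instead dispatch all experts through one grouped kernel.

```python
import torch
import torch.nn as nn

class NaiveMoE(nn.Module):
    """Unfused MoE: one nn.Linear per expert, routed with a Python loop."""

    def __init__(self, hidden_size: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Linear(hidden_size, hidden_size, bias=False)
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_size)
        weights = torch.softmax(self.gate(x), dim=-1)
        topk_w, topk_idx = weights.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        # Per-expert loop: this is exactly what a fused kernel replaces.
        for e, expert in enumerate(self.experts):
            token_ids, slot = (topk_idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            out[token_ids] += topk_w[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out
```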

Yes, this is definitely doable. I'll work on this as soon as I have the bandwidth.

fpX quants currently enforce eager mode due to a bug in their kernels. This will be addressed soon, but in the meantime we should probably log this behaviour. Thanks for reporting.
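A minimal sketch of what logging that override might look like. The function, the `FPX_METHODS` set, and the method names in it are all assumptions for illustration, not the engine's actual API:

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Assumed fpX method names, purely for illustration.
FPX_METHODS = {"fp4", "fp6", "fp8"}

def maybe_enforce_eager(quantization: Optional[str], enforce_eager: bool) -> bool:
    """Force eager mode for fpX quants and tell the user why, rather than doing it silently."""
    if quantization in FPX_METHODS and not enforce_eager:
        logger.warning(
            "%s quantization currently requires eager mode due to a kernel bug; "
            "enforcing eager mode (CUDA graphs disabled).",
            quantization,
        )
        return True
    return enforce_eager
```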

This issue has been fixed on the latest main. There are no releases for it yet (and it likely won't be on PyPI for a while, due to their wheel size...

> > due to their wheel size restrictions
>
> They made an exception for pytorch [pypa/packaging-problems#96](https://github.com/pypa/packaging-problems/issues/96)

We requested an extension before at pypi/support#4036, and it was approved. But since...

We do support Qwen2.5. It's not listed in the supported models list because it uses the same architecture as Qwen2 (`Qwen2ForCausalLM`); see https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/495f39366efef23836d0cfae4fbe635880d2be31/config.json#L3. We don't support Qwen2.5-Vision yet, because...
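You can confirm which architecture a checkpoint declares without downloading the weights by reading the `architectures` field from its config, e.g. with `transformers`:

```python
from transformers import AutoConfig

# Fetches only config.json from the Hub, not the weights.
config = AutoConfig.from_pretrained("Qwen/Qwen2.5-72B-Instruct")
print(config.architectures)  # ['Qwen2ForCausalLM'] -> served by the existing Qwen2 code path
```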

This seems to be an issue with the quantized model; it looks like one (or all) of the layers doesn't have a config defined for it. Maybe @wejoncy has an idea?
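One way to narrow this down is to scan the checkpoint's safetensors index and see which layers carry quantized tensors. This is a hypothetical diagnostic sketch: the tensor suffixes (`qweight`, `scales`) and the `proj` heuristic are assumptions that vary by quantization format:

```python
import json
from pathlib import Path

# Group tensor names in the checkpoint index by layer prefix.
index = json.loads(Path("model.safetensors.index.json").read_text())
layers: dict[str, set[str]] = {}
for tensor_name in index["weight_map"]:
    prefix, _, suffix = tensor_name.rpartition(".")
    layers.setdefault(prefix, set()).add(suffix)

# Flag layers whose tensors don't match the expected quantized layout.
for prefix, suffixes in sorted(layers.items()):
    if "qweight" in suffixes and "scales" not in suffixes:
        print(f"{prefix}: quantized weight without scales")
    elif "weight" in suffixes and "qweight" not in suffixes and "proj" in prefix:
        print(f"{prefix}: linear layer left unquantized")
```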