nopperl

Results 7 issues of nopperl

This PR adds a Dockerfile for reproducible builds.

It might make sense to add the option of using [Conda](https://docs.conda.io/en/latest/) to set up the local dev environment. It should make contributing easier, as it's cross-platform and doesn't require...

enhancement

### 📚 The doc issue

Since [llama.cpp](https://github.com/ggerganov/llama.cpp) now supports the OLMo architecture, it might make sense to mention this inference and quantization option in the readme. It's especially useful for...

type/documentation

The readme instructs users to run the old `torch.distributed.launch` command instead of `torch.distributed.run`. The old command is incompatible with `finetune/finetune.py` because it sets the local rank using `--local-rank` instead of `--local_rank`, leading to...
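The mismatch between the two flag spellings can be sketched as follows. This is a minimal, hypothetical example and not the code in `finetune/finetune.py`: the helper name `parse_local_rank` is an assumption made for illustration. It accepts both spellings of the flag and, if neither is given, falls back to the `LOCAL_RANK` environment variable that `torch.distributed.run` exports.

```python
import argparse
import os


def parse_local_rank(argv=None, env=None):
    """Return the local rank from CLI flags or the LOCAL_RANK environment variable.

    Hypothetical helper for illustration; not taken from finetune/finetune.py.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # Register both spellings so the script works whichever one the launcher passes.
    parser.add_argument(
        "--local_rank", "--local-rank", dest="local_rank", type=int, default=None
    )
    # Ignore any other arguments the training script may define elsewhere.
    args, _unknown = parser.parse_known_args(argv)
    if args.local_rank is not None:
        return args.local_rank
    # torch.distributed.run exports LOCAL_RANK instead of passing a flag.
    return int(env.get("LOCAL_RANK", 0))
```

With this pattern, a script no longer depends on which launcher, or which flag spelling, was used to start it.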

## Description

Add support for running the UI using Docker. Since the previous PRs (https://github.com/vladmandic/automatic/pull/403, https://github.com/vladmandic/automatic/pull/844) stalled, I merged their approaches and fixed the remaining issues.

## Notes

To improve security,...

- [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md)
- Self-reported review complexity:
  - [ ] Low
  - [x] Medium
  - [ ] High

This PR adds support for the [Chameleon](https://github.com/facebookresearch/chameleon)...

python
Review Complexity : Medium

### Your current environment

vllm@cf069aa

### 🐛 Describe the bug

Running models using the transformers fallback fails if `vllm_config.model_config.hf_config` does not contain `head_dim`. For example, using `Qwen/Qwen2.5-0.5B-Instruct`: ```python from vllm...

bug
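One defensive way to handle a config without `head_dim` is sketched below. This is an assumption-laden illustration, not vLLM's actual fix: the helper name `resolve_head_dim` is hypothetical, and the fallback assumes the config exposes the `hidden_size` and `num_attention_heads` attributes and that the head dimension equals their quotient, which does not hold for every architecture.

```python
from types import SimpleNamespace


def resolve_head_dim(hf_config):
    """Return head_dim if the config defines it, else derive it.

    Hypothetical helper; assumes head_dim == hidden_size // num_attention_heads
    whenever the attribute is missing, which is not true for all models.
    """
    head_dim = getattr(hf_config, "head_dim", None)
    if head_dim is not None:
        return head_dim
    return hf_config.hidden_size // hf_config.num_attention_heads


# Stand-in config objects with illustrative values, not a real model's config.
without_head_dim = SimpleNamespace(hidden_size=1024, num_attention_heads=16)
with_head_dim = SimpleNamespace(hidden_size=1024, num_attention_heads=16, head_dim=128)

derived = resolve_head_dim(without_head_dim)   # derived from the other two fields
explicit = resolve_head_dim(with_head_dim)     # explicit value takes precedence
```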