Luca Beurer-Kellner

Results: 149 comments by Luca Beurer-Kellner

Dataclass support is a feature in preview. The core team is more focused on the next major version of LMQL right now which will also help with this particular feature,...

Hi there. The error message suggests there may be an issue with your installation of bitsandbytes or transformers. Maybe this helps: https://github.com/oobabooga/text-generation-webui/issues/2397

Similar to Anthropic models (#118), GCP does not currently seem to offer support for model steering, i.e. masking of the model distribution during text generation: https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/text This means we can...

This is somewhat by design, but we should probably add better client-side error messages when users try to configure this from there. The thinking here is that the concrete inference...

I am working on enabling this soon; it requires some more changes with respect to stopping generation early, though, so it will not be immediately available. One thing that may...

Thanks for raising this, we will keep it on our radar. It should be simple to add support for this, once it is upstreamed in llama.cpp/llama-cpp-python.

Marking this as a good first issue for backend work. The `llama.cpp` backend lives in https://github.com/eth-sri/lmql/blob/main/src/lmql/models/lmtp/backends/llama_cpp_model.py and is currently limited to a `max_batch_size` of 1.

Hi there, I just tried to reproduce this on my workstation, and it all seems to work. Can you make sure to re-install llama.cpp with the correct build flags to...
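For reference, a typical way to force a rebuild of `llama-cpp-python` with GPU support looks roughly like this. This is a sketch, not the project's official instructions, and the exact CMake flag depends on the version (older releases used `-DLLAMA_CUBLAS=on`):

```shell
# Rebuild llama-cpp-python from source with CUDA enabled, bypassing any
# cached CPU-only wheel. Flag names have changed across llama.cpp versions.
CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```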

Hi there Moritz, with `transformers` we use `device_map='auto'`, which should automatically make use of all available GPUs (e.g. as specified via `CUDA_VISIBLE_DEVICES`). Could you check with `nvidia-smi` that all GPUs are...

Can you try running a long generation just with the `transformers` API, i.e. `AutoModelForCausalLM.generate` and `device_map="auto"`? From your description, it sounds like the model is cloned to each card rather than distributed across them.
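A minimal sketch of what such a check could look like. The `transformers` call is commented out since it needs GPUs and the `accelerate` package; `gpt2` is just a placeholder checkpoint:

```python
import os

# Restrict which GPUs `device_map="auto"` can see. This must happen before
# torch/transformers initialises CUDA, so set it at the very top of the script.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# Hypothetical usage (requires GPUs and the `accelerate` package):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "gpt2",              # placeholder checkpoint
#     device_map="auto",   # shard layers across all visible GPUs
# )
# print(model.hf_device_map)  # per-module placement; seeing multiple device
#                             # ids here means the model really is distributed

print(os.environ["CUDA_VISIBLE_DEVICES"])
```

If `model.hf_device_map` only ever shows a single device id, the layers are not being sharded, which would explain the cloned-per-card behaviour described above.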