Rémi Louf
I think we should dedicate pages to how to use Outlines to connect to a vLLM server, an Ollama server, a llama.cpp server etc. It should be impossible to miss...
You can deactivate Outlines' cache using:

```python
from outlines import caching

caching.disable_cache()
```

And use `functools`' `lru_cache` instead.
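A minimal sketch of the `functools.lru_cache` alternative, assuming you want to memoize your own wrapper function (the `build_prompt` function here is a hypothetical example, not part of Outlines):

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def build_prompt(a: int) -> str:
    # Result is computed once per distinct argument, then served from the cache.
    return f"What is {a} squared?"


build_prompt(4)  # computed
build_prompt(4)  # cache hit
print(build_prompt.cache_info())  # hits/misses counters for inspection
```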
Thank you for contributing! Could you add a test that fails on current `main` and passes here?
Thank you for making the change :) Is it ready for review?
We deprecated the old CFG backend in favor of integrating xgrammar and llguidance.
Thanks for reporting the issue! FYI `transformers` uses greedy sampling by default: https://huggingface.co/docs/transformers/en/main_classes/text_generation
This has been bothering me for a while too, thank you for contributing! I think we might as well _return_ the template. This would allow string manipulations within the function...
Thanks! Feel free to open a PR to fix this, it shouldn't be a big change to the current logic.
> Fix #1346

WDYT about the integration with `outlines.prompt`? We should allow functions that are not decorated, such as

```python
def build_prompt(a: int) -> str:
    return f"What is {a} squared?"
```
...