Walter Nelson

Results: 27 comments by Walter Nelson

The `select` vs `gen` discrepancy happens with (repro sketch after this list):

* L3: `unsloth/llama-3-70b-Instruct-bnb-4bit`
* L3: `meta-llama/meta-llama-3-8B-Instruct` (so it's not internal `unsloth` modifications causing issues, at least)

The whitespace issue happens with:

* Phi:...
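To make the comparison concrete, here is a minimal sketch of the kind of side-by-side check involved, run under greedy decoding; the prompt and option list are placeholders I chose, not the original repro:

```python
from guidance import models, gen, select

# Model name taken from the list above; prompt/options are illustrative.
lm = models.Transformers("meta-llama/meta-llama-3-8B-Instruct")
prompt = 'Answer with "yes" or "no". Is the sky blue?\nAnswer: '

# Unconstrained greedy generation...
via_gen = lm + prompt + gen(max_tokens=2, temperature=0.0)
# ...versus generation constrained to the same small option set.
via_select = lm + prompt + select(["yes", "no"])

# Under greedy decoding these should agree; the discrepancy is that they don't.
print(str(via_gen))
print(str(via_select))
```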

And here are my experiments suggesting that this relates to the need to heal tokens around the "boundaries" between prompting and generation:

```
>>> tokenizer.encode(prompt + "{\n ")
[128000,...
```
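To see the boundary effect in isolation, here is a self-contained sketch (the prompt string is a placeholder): tokenizing the prompt together with the start of the forced text can yield different token IDs than tokenizing the two pieces separately, which is exactly the situation token healing is meant to repair:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/meta-llama-3-8B-Instruct")
prompt = "Respond with a JSON object."  # placeholder prompt

# Encoded jointly, characters can merge into tokens that span the
# prompt/generation boundary...
joint = tok.encode(prompt + "{\n ")
# ...whereas encoding the pieces separately pins the boundary in place.
split = tok.encode(prompt) + tok.encode("{\n ", add_special_tokens=False)

print(joint == split)  # often False: the tokens around the boundary differ
```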

`LlamaCpp` produces the same results as `transformers`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, PretrainedConfig
import guidance
from llama_cpp import Llama

print("guidance version: ", guidance.__version__)

model_name = "meta-llama/meta-llama-3-8B-Instruct"  # only for...
```
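In rough outline, the comparison looks like the following (the GGUF path and sampling settings are my assumptions, not the original script): the same guidance program is run against both backends and the outputs are compared.

```python
from guidance import models, gen

# Hypothetical local paths/names, for illustration only.
lm_hf = models.Transformers("meta-llama/meta-llama-3-8B-Instruct")
lm_gguf = models.LlamaCpp("Meta-Llama-3-8B-Instruct.Q4_K_M.gguf")

for lm in (lm_hf, lm_gguf):
    out = lm + "Return a JSON object describing a cat: " + gen(max_tokens=16, temperature=0.0)
    print(str(out))
```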

Yes, I think you're right! Here's an example where it still happens (this time, a discrepancy between two `gen` calls with overlapping prompts), even without token healing being a factor...
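The shape of such a repro (placeholder prompt; not the original example) is to compare greedy continuations of a prompt and of that same prompt extended by text the model itself would have produced; since greedy decoding is deterministic, the longer prompt's continuation should extend the shorter one's token for token:

```python
from guidance import models, gen

lm = models.Transformers("meta-llama/meta-llama-3-8B-Instruct")
prompt = "Count upward from 1: 1, 2,"

# If `short` greedily continues with " 3,", then `long`'s prompt overlaps
# `short`'s prompt plus its first generated tokens, so the remaining
# continuations ought to match exactly. The bug is when they don't.
short = lm + prompt + gen(max_tokens=8, temperature=0.0)
long = lm + prompt + " 3," + gen(max_tokens=8, temperature=0.0)

print(str(short))
print(str(long))
```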

Thanks folks -- yes, absolutely, I love the JSON option when it's available to me. I haven't checked it for this minimal reproducible example (because it's just that for me...
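For readers who haven't used it, the JSON option constrains decoding to a supplied schema; a minimal sketch, assuming the `guidance.json` helper available in recent releases (the schema and capture name below are mine):

```python
from guidance import models, json as gen_json

# Illustrative schema, not the one from the original issue.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

lm = models.Transformers("meta-llama/meta-llama-3-8B-Instruct")
lm += "Describe a person as JSON: "
lm += gen_json(name="person", schema=schema)

print(lm["person"])  # constrained to parse and validate against the schema
```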

Ok, neat! I had assumed that guidance internally was doing something similar to outlines w.r.t. regex-based JSON generation, but after looking at the code more closely it looks like the...
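For contrast, here is the flavor of the regex-based approach (a toy illustration I wrote, not actual outlines output): the JSON schema is compiled to a regular expression, and decoding is constrained so the output always matches it:

```python
import re

# Toy regex one might derive from the schema
# {"type": "object", "properties": {"age": {"type": "integer"}}, "required": ["age"]}
AGE_OBJECT = re.compile(r'\{\s*"age"\s*:\s*-?\d+\s*\}')

print(bool(AGE_OBJECT.fullmatch('{"age": 42}')))    # True
print(bool(AGE_OBJECT.fullmatch('{"age": "42"}')))  # False: string, not integer
```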

@riedgar-ms @hudson-ai So, I gave it a shot. [wjn0/guidance@improve-json-schema-support](https://github.com/wjn0/guidance/tree/improve-json-schema-support) contains a few hackish changes that I required for my schema. These are not legitimate fixes (i.e. you wouldn't want these...