flexorRegev
@fmmoret @Yard1 From what it looks like in this PR, there isn't anything inherent preventing quantized models + LoRA from being supported right now; it just wasn't...
I was also trying to run Yi and hit the same problem. @Yard1 can you elaborate on what's needed to support this? I'd be glad to work on this PR.
@Yard1 How does Gemma avoid this? It also has a huge vocab_size.
Update: I got Yi working with a Gemma-like adaptation :)
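For context, here is a hedged sketch of what a "Gemma-like adaptation" plausibly amounts to: declaring LoRA metadata on the Yi model class the way the Gemma implementation does, and leaving the embedding/lm_head modules out so the huge vocab_size never hits the LoRA vocab-size check. The attribute names follow vLLM's convention at the time (`supported_lora_modules`, `packed_modules_mapping`, etc.), but the class below is an illustration, not the actual patch:

```python
# Hypothetical sketch of a LoRA-enabled Yi model class, mirroring the
# Gemma pattern. Not the real vLLM code; names and values are assumptions.

class YiForCausalLMWithLoRA:
    # Linear projections that LoRA adapters are allowed to target.
    supported_lora_modules = [
        "qkv_proj", "o_proj", "gate_up_proj", "down_proj",
    ]
    # Fused modules and the original per-weight names they pack together.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
    # embed_tokens / lm_head are deliberately *not* listed, so the large
    # vocabulary never has to pass the LoRA vocab-size limit.
    embedding_modules = {}
    embedding_padding_modules = []
```

The key observation is that if adapters never touch the vocabulary-sized matrices, vocab_size becomes irrelevant to the LoRA path.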
> 2. Ability to skip ahead if there is no choice between tokens (next token is dictated by a schema)

How would you think about creating this? Since the sampler...
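One way to think about the skip-ahead idea: when the schema's state machine permits exactly one next token, emit it directly and never run the model; only pay for a forward pass when there is a real choice. A toy sketch with a hypothetical FSM-style interface (not vLLM's actual sampler API):

```python
from typing import Callable

def generate_with_skip_ahead(
    allowed_tokens: Callable[[list[int]], set[int]],
    sample: Callable[[list[int], set[int]], int],
    eos: int,
    max_len: int = 32,
) -> tuple[list[int], int]:
    """Decode under a schema, skipping the model on forced steps.

    allowed_tokens(prefix) -> tokens the schema allows next (hypothetical).
    sample(prefix, allowed) -> model + sampler pick among allowed tokens.
    Returns (tokens, number_of_model_calls_saved).
    """
    out: list[int] = []
    saved = 0
    while len(out) < max_len:
        allowed = allowed_tokens(out)
        if len(allowed) == 1:
            tok = next(iter(allowed))  # forced by the schema: no forward pass
            saved += 1
        else:
            tok = sample(out, allowed)
        out.append(tok)
        if tok == eos:
            break
    return out, saved

# Toy schema: must emit the literal [7, 8, 9], then any of {1, 2}, then EOS(0).
def toy_allowed(prefix: list[int]) -> set[int]:
    literal = [7, 8, 9]
    if len(prefix) < len(literal):
        return {literal[len(prefix)]}
    if len(prefix) == len(literal):
        return {1, 2}
    return {0}

tokens, saved = generate_with_skip_ahead(toy_allowed, lambda p, a: min(a), eos=0)
print(tokens, saved)  # [7, 8, 9, 1, 0] 4 -- four steps never touched the model
```

The batching question is the hard part in a real engine: forced sequences finish their steps "for free", so they drift ahead of sequences that still need forward passes.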
A few questions here:
1. Why did you hardcode max_len at 512 tokens?
2. It seems like there are about 3 locations for the configuration of max_length; this causes...
Cool. Right now I added a parameter for the document length and I'm setting it upstream manually (not ideal, but working). I tried using a PNG and it looks pretty good...