Eric Buehler

543 comments by Eric Buehler

Thank you, @ArthurZucker, for the link! I was actually able to get the GPT2 conversion to work now!

@BHX2, @LLukas22, I just merged #262! You can now use per-request LoRA activation, which is available in all APIs. After setting up your adapter model ordering file, you can try it out: [examples...

@gregszumel, thanks for the explanation. I would love to see this added to Candle. If you want to contribute this to mistral.rs please feel free!

@LaurentMazare, thanks! I saw that PR and am very excited for it to be merged.

@gregszumel, that sounds great! If you decide to contribute it to mistral.rs, that would be much appreciated.

For future reference, here's the implementation: https://github.com/EricLBuehler/mistral.rs/blob/6aec940499be1cf72c628f7ddaa8b3e59bcb4fda/mistralrs-core/src/ops.rs#L482-L504

For speculative decoding, we need to run the target model with multiple tokens at once, once per step. If we need to run the target model with a full prompt,...
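To illustrate the "multiple tokens at once, once per step" point, here is a minimal sketch of one greedy speculative decoding step. It is not the mistral.rs implementation: `draft_next`, `target_predict`, and `speculative_step` are hypothetical names, the two "models" are toy arithmetic functions, and the acceptance rule is the simple greedy-match variant rather than rejection sampling.

```python
from typing import Callable, List

def speculative_step(
    prefix: List[int],
    draft_next: Callable[[List[int]], int],
    target_predict: Callable[[List[int], List[int]], List[int]],
    k: int,
) -> List[int]:
    """Run one speculative step and return the newly accepted tokens."""
    # 1. The (cheap) draft model proposes k tokens, one at a time.
    draft: List[int] = []
    for _ in range(k):
        draft.append(draft_next(prefix + draft))

    # 2. The target model is called ONCE over all k draft tokens.
    #    preds[i] is its greedy choice after prefix + draft[:i],
    #    so preds has k + 1 entries.
    preds = target_predict(prefix, draft)

    # 3. Accept draft tokens while they match the target's choice.
    accepted: List[int] = []
    for d, p in zip(draft, preds):
        if d != p:
            accepted.append(p)  # take the target's token and stop
            return accepted
        accepted.append(d)

    # All k draft tokens matched: the extra prediction is a free token.
    accepted.append(preds[len(draft)])
    return accepted

# Toy stand-ins (assumptions for illustration only).
def target_next(tokens: List[int]) -> int:
    return (tokens[-1] + 1) % 10

def target_predict(prefix: List[int], draft: List[int]) -> List[int]:
    # Emulates a single batched forward pass; prefix must be non-empty.
    seq = prefix + draft
    return [(seq[i] + 1) % 10 for i in range(len(prefix) - 1, len(seq))]

def draft_next(tokens: List[int]) -> int:
    # Agrees with the target except right after token 3.
    return 0 if tokens[-1] == 3 else target_next(tokens)

new_tokens = speculative_step([1], draft_next, target_predict, k=4)
```

In this toy run the draft diverges from the target after token 3, so the step keeps the two matching draft tokens plus the target's correction. Every step yields at least one token from a single target call.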

Ok, so just to confirm: it is this part? > https://github.com/huggingface/candle/pull/2111/files#diff-ed262e4bc9a4a093e64842a2f61a85e1713c4efde0618ac7b31ad58dc5d171e3R137-R149 I can add a PR for this to some of the models if you think it is a good...