flashinfer

Basic inference example for Llama/Mistral

Open vgoklani opened this issue 1 year ago • 3 comments

Hey there,

Thanks for sharing your library!

Is there a basic Llama/Mistral example implemented that we could read through?

I'd like to test the inference code on the Mistral 7B reference implementation. Thanks!

vgoklani, Feb 05 '24 21:02

Hi @vgoklani, good idea. I'm thinking about a minimal end-to-end example (~500 LOC), so please stay tuned :)

yzh119, Feb 07 '24 08:02

Thanks! Something using nanoGPT (framework independent) would be great!

vgoklani, Feb 07 '24 08:02

Any update on this?

Manojbhat09, Mar 22 '24 11:03