quiet-star Add infer code

Open ostix360 opened this issue 11 months ago • 4 comments

This PR adds a file containing the minimal code needed to run inference with the model and get consistent output.

Inference seems very slow when generating 100 tokens, but the output is consistent.
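To make the shape of such a loop concrete, here is a rough sketch. It is not the file from this PR. It assumes "consistent" means deterministic greedy decoding, and `greedy_generate` and `toy_logits` are hypothetical names, with `toy_logits` standing in for the real model's forward pass. It also shows how the time for 100 tokens could be measured: one model call is made per new token.

```python
import time
from typing import Callable, List, Sequence


def greedy_generate(
    next_token_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int = 100,
) -> List[int]:
    """Append max_new_tokens tokens, always picking the highest-scoring one."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # One call per new token; with a real model this is the expensive step.
        logits = next_token_logits(ids)
        # Greedy choice (argmax): no sampling, so repeated runs give the same output.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
    return ids


def toy_logits(ids: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for a model: favors (last token + 1) mod 5."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


start = time.perf_counter()
out = greedy_generate(toy_logits, [0], max_new_tokens=100)
elapsed = time.perf_counter() - start
print(f"{len(out) - 1} new tokens in {elapsed:.4f}s")
```

Swapping `toy_logits` for a function that wraps the actual model would give a tokens-per-second figure to compare against.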

What do you think?

Mar 21 '24 20:03 ostix360