quiet-star
Add inference code
This PR adds a file containing the minimal code needed to run inference on the model with consistent output.
Generating 100 tokens seems very slow, but the output is consistent.
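For reviewers, here is a rough sketch of one way consistent output can be obtained: greedy decoding, where the highest-scoring token is picked at every step so repeated runs produce the same sequence. This is an illustration under that assumption, not the code in this PR. `greedy_generate` and `toy_scores` are hypothetical names, and `toy_scores` is a stand-in for the real model's forward pass.

```python
import time
from typing import Callable, List, Optional


def greedy_generate(
    next_token_scores: Callable[[List[int]], List[float]],
    prompt_ids: List[int],
    max_new_tokens: int = 100,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Append the highest-scoring token at each step (no sampling)."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        scores = next_token_scores(ids)
        # argmax over the vocabulary; deterministic for a deterministic scorer
        next_id = max(range(len(scores)), key=scores.__getitem__)
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids


def toy_scores(ids: List[int]) -> List[float]:
    """Hypothetical stand-in for a model: scores over a 5-token vocabulary."""
    vocab_size = 5
    last = ids[-1]
    return [float((last * 3 + token * 7) % vocab_size) for token in range(vocab_size)]


if __name__ == "__main__":
    start = time.perf_counter()
    first = greedy_generate(toy_scores, [1], max_new_tokens=100)
    elapsed = time.perf_counter() - start
    second = greedy_generate(toy_scores, [1], max_new_tokens=100)
    print("identical runs:", first == second)
    print(f"new tokens per second: {100 / max(elapsed, 1e-9):.0f}")
```

Timing the loop this way gives a tokens-per-second figure, which may help put a number on the slowness mentioned above.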
What do you think?