torchchat



Add max-autotune for CPU, update profile and fix next token calculation

Open yanbing-j opened this issue 6 months ago • 5 comments

This PR adds max-autotune support for CPU in torch.compile. It also reports the first token and the next tokens separately in the log output.
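As a rough illustration of the two changes described above, here is a minimal sketch and not the PR's actual code. `torch.compile(..., mode="max-autotune")` is a real PyTorch call, but the helper names `compile_for_cpu`, `split_token_stats`, and `format_token_log`, as well as the log format, are hypothetical and chosen only for this example. It assumes the first timed step covers the first token and every later step covers one subsequent token.

```python
from typing import List, Tuple


def compile_for_cpu(model):
    """Hypothetical helper: compile a model with Inductor's max-autotune mode."""
    # Imported lazily so the timing helpers below can be used without PyTorch.
    import torch

    return torch.compile(model, mode="max-autotune")


def split_token_stats(step_seconds: List[float]) -> Tuple[float, float]:
    """Return (first_token_seconds, next_tokens_per_second).

    step_seconds[0] is assumed to be the time to produce the first token;
    the remaining entries are the per-token times for the tokens after it.
    """
    if not step_seconds:
        raise ValueError("need at least one timed step")
    first = step_seconds[0]
    rest = step_seconds[1:]
    total_rest = sum(rest)
    # Throughput is computed over the next tokens only, so the slower
    # first step does not skew the per-token number.
    next_tps = len(rest) / total_rest if total_rest > 0 else 0.0
    return first, next_tps


def format_token_log(step_seconds: List[float]) -> str:
    """Build one log line that reports the first and next tokens separately."""
    first, next_tps = split_token_stats(step_seconds)
    return f"first token: {first:.3f} s | next tokens: {next_tps:.2f} tok/s"
```

Keeping the first token out of the throughput figure matters because the first step typically includes prompt processing (and, with torch.compile, possibly compilation), which would otherwise drag down the reported next-token rate.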

Aug 23 '24 08:08 yanbing-j