torchchat
Add max-autotune for CPU, update profile and fix next token calculation
This PR adds max-autotune for CPU in torch.compile. It also splits the first token and the next tokens in the log output, so the two are reported separately.
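As a rough illustration of the first-token / next-token split, here is a minimal sketch in plain Python. It is not the code from this PR: `split_token_metrics`, `timed_generate`, and the dictionary keys are hypothetical names chosen for the example, and it assumes the first generation step covers the first token while each later step produces one next token.

```python
import time
from typing import Callable, Dict, List


def split_token_metrics(step_seconds: List[float]) -> Dict[str, float]:
    """Report first-token latency separately from next-token statistics.

    Assumption (hypothetical layout): step_seconds[0] is the time taken to
    produce the first token, and every later entry is the time for one
    subsequent token.
    """
    if not step_seconds:
        raise ValueError("need at least one generated token")
    first = step_seconds[0]
    rest = step_seconds[1:]
    next_total = sum(rest)
    # Average over the next tokens only, so the first token does not skew it.
    next_avg = next_total / len(rest) if rest else 0.0
    next_tps = len(rest) / next_total if next_total > 0 else 0.0
    return {
        "first_token_s": first,
        "next_token_avg_s": next_avg,
        "next_tokens_per_s": next_tps,
    }


def timed_generate(step_fn: Callable[[int], object], num_tokens: int) -> List[float]:
    """Time each generation step; step_fn is a stand-in for one decode call."""
    durations = []
    for i in range(num_tokens):
        start = time.perf_counter()
        step_fn(i)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    metrics = split_token_metrics(timed_generate(lambda i: None, 8))
    print(f"Time to first token: {metrics['first_token_s']:.6f} s")
    print(f"Next token average:  {metrics['next_token_avg_s']:.6f} s")
```

Keeping the first step out of the average matters when torch.compile is involved, since compilation and autotuning work typically lands on the earliest calls and would otherwise inflate the per-token figure.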