FastChat
Specify ASCEND NPU for inference.
Why are these changes needed?
When deploying inference services on ASCEND NPU, there is currently no way to specify which card to use. @infwinston @CodingWithTim
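For reviewers unfamiliar with device selection on this hardware, here is a minimal sketch of one way card selection could work. It is not necessarily what this PR implements. It assumes the Ascend runtime reads the `ASCEND_RT_VISIBLE_DEVICES` environment variable (analogous to `CUDA_VISIBLE_DEVICES`) and that the variable is set before the runtime initializes; the helper name `select_npu_devices` and its parameters are hypothetical.

```python
import os

# Assumption: the Ascend runtime restricts visible cards based on this
# environment variable, much like CUDA_VISIBLE_DEVICES does for GPUs.
NPU_VISIBLE_DEVICES_ENV = "ASCEND_RT_VISIBLE_DEVICES"


def select_npu_devices(device_ids, num_devices=1, env=None):
    """Hypothetical helper: validate a comma-separated card list such as
    "0,1" and export it so that only those cards are visible.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    Returns the normalized value that was written.
    """
    env = os.environ if env is None else env

    # Split on commas and drop surrounding whitespace and empty entries.
    ids = [part.strip() for part in device_ids.split(",") if part.strip()]
    if not ids:
        raise ValueError("no NPU device ids given")
    if not all(i.isdecimal() for i in ids):
        raise ValueError(f"device ids must be non-negative integers: {device_ids!r}")
    if len(ids) < num_devices:
        raise ValueError(
            f"requested {num_devices} devices but only {len(ids)} ids were given"
        )

    value = ",".join(ids)
    env[NPU_VISIBLE_DEVICES_ENV] = value
    return value
```

In this sketch the function would be called once at startup, before any NPU library is imported, for example `select_npu_devices("2,3", num_devices=2)`.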
Related issue number (if applicable)
Checks
- [x] I've run `format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed.
- [x] I've made sure the relevant tests are passing (if applicable).