intel-extension-for-transformers
[NeuralChat] Support Assisted Generation on Multi-nodes
Type of Change
Feature. APIs added:
- /v1/assist/chat
- /v1/assist/decode
- /v1/assist/data_transfer
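
For reviewers who want to poke at the new routes locally, a minimal sketch of building a request against them follows. Only the three route paths come from this PR. The host, port, HTTP method (POST), and JSON payload shape are assumptions for illustration, and `build_assist_request` is a hypothetical helper, not part of NeuralChat.

```python
import json
from urllib.request import Request

# Route paths as listed in this PR.
ASSIST_ROUTES = {
    "chat": "/v1/assist/chat",
    "decode": "/v1/assist/decode",
    "data_transfer": "/v1/assist/data_transfer",
}


def build_assist_request(base_url: str, route: str, payload: dict) -> Request:
    """Build (but do not send) a JSON request for one of the assist routes.

    Assumption: the routes accept POST with a JSON body. The PR does not
    specify the method or the request schema.
    """
    url = base_url.rstrip("/") + ASSIST_ROUTES[route]
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, port, and payload; none of these come from the PR.
req = build_assist_request("http://localhost:8000", "chat", {"prompt": "Hello"})
```

Once a server is running, the request could be sent with `urllib.request.urlopen(req)`. The real payload fields should be taken from the final implementation.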
Description
Adds support for assisted generation across multiple nodes. This PR implements the code framework only; the remaining details will be completed by Wangyi's team. JIRA: https://jira.devtools.intel.com/browse/NLPTOOLKIU-1126
Expected Behavior & Potential Risk
The assisted generation RESTful API will be able to run across multiple nodes.
How has this PR been tested?
Tested locally. This is currently a draft PR.
Dependency Change?
None.