Letong Han
## Type of Change

feature

API added:

- /v1/assist/chat
- /v1/assist/decode
- /v1/assist/data_transfer

## Description

Support Assisted Generation on Multi-nodes. The code framework is implemented. Details will be completed by...
## Type of Change

Add NeuralChat example

API not changed

## Description

Add Multi-Socket LLM inference example for NeuralChat. Related DeepSpeed PR: https://github.com/microsoft/DeepSpeed/pull/4750 (not merged yet)

## Expected Behavior &...
## Description

Fix a tgi-gaudi-server issue where `hl-smi` could not be used and the device could not be found on Habana Gaudi.

## Issues

n/a

## Type of change

List the type of change like below....
## Description

Add Nginx to CodeTrans. Modify `compose.yaml`, `README.md`, and `set_env.sh`.

## Issues

n/a

## Type of change

List the type of change like below. Please delete options that are...
## Description

Add a k8s manifest for Nginx in CodeTrans.

## Issues

n/a

## Type of change

List the type of change like below. Please delete options that are not relevant....