Shunkangz
Results
2
issues of
Shunkangz
In the current implementation, we will return the first and second generated token together from generation worker. I refactor this logic and return the first generated token from context worker...
Add support of chat completion in PD and fix the include_usage option.