Shunkangz

2 issues by Shunkangz

In the current implementation, the generation worker returns the first and second generated tokens together. I refactored this logic so that the first generated token is returned from the context worker...
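A rough sketch of the behavior change described above, not the actual implementation: `fake_decode`, `old_stream`, and `new_stream` are hypothetical names made up for illustration, and the "workers" are modeled as plain labels on each streamed chunk. It only shows which worker emits which token before and after the refactor.

```python
from typing import Iterator, List, Tuple

Chunk = Tuple[str, List[str]]  # (name of the emitting worker, tokens in this chunk)


def fake_decode(prompt: str, n: int) -> List[str]:
    """Hypothetical stand-in for the model: returns n placeholder tokens."""
    return [f"tok{i}" for i in range(1, n + 1)]


def old_stream(prompt: str, n: int) -> Iterator[Chunk]:
    """Before: the generation worker's first chunk carries tokens 1 and 2."""
    tokens = fake_decode(prompt, n)
    if not tokens:
        return
    yield ("generation", tokens[:2])
    for tok in tokens[2:]:
        yield ("generation", [tok])


def new_stream(prompt: str, n: int) -> Iterator[Chunk]:
    """After: the context worker emits token 1, generation emits the rest."""
    tokens = fake_decode(prompt, n)
    if not tokens:
        return
    yield ("context", tokens[:1])
    for tok in tokens[1:]:
        yield ("generation", [tok])


if __name__ == "__main__":
    for chunk in new_stream("hello", 4):
        print(chunk)
```

Under these assumptions the client receives the same token sequence either way; the difference is that the first token no longer waits for the generation worker.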

Add support for chat completion in PD and fix the include_usage option.