Arthur Cheng

3 comments by Arthur Cheng

> "Yet, RTN is often believed to lag behind more advanced quantization techniques in two crucial areas – generation throughput and accuracy." What does it look like now with your...

@frankzhouhr I can offer some help. This is a feature we have wanted for a long time.

@justinSmileDate Thanks for your interest. By default, we use the k8s service to distribute inference traffic. Additionally, we do have a more complicated design of and support...