Woosuk Kwon

284 comments by Woosuk Kwon

@gshtras Can you please provide a concrete number (how much improvement)?

@zhuohan123 Could you please take a look at this? This PR is pretty interesting.

@njhill I believe this may be related to #3466. Could you also take a look?

Hi @xinji1, thanks for the detailed RFC and the PR, and sorry for the late reply. If I understand correctly, the speedup comes from: 1. Sorting the requests in the...

Yeah, let me fix the error. The PR passed the pre-commit test in CI somehow.

@tlrmchlsmth Just wondering: Do you have any accuracy or performance benchmark numbers?

@njhill Good question. Actually, the MP backend would also work for TPUs. However, I think users such as GKE prefer Ray, because 1) they are interested in multi-host inference (which...

@jikunshang It seems like using IPEX somehow changes the output of the models and thus fails the model tests. Could you please take a look?

@markmc Is this PR waiting for review? Or is it in progress?

@markmc Can you please merge from main again?