Tianqi Chen

Results 637 comments of Tianqi Chen

We have recently updated the process to focus on JIT compilation, so closing for now.

Do you mind trying out the Python API https://llm.mlc.ai/docs/deploy/python_engine.html and providing a reproducible script that triggers this error?

Thank you. Do you also mind commenting on the GPU you have and its VRAM size?

Thanks for the suggestion. This is something that we are planning to do with a broader set of model support. We are ramping up some infrastructure and looking to have something...

context https://github.com/mlc-ai/mlc-llm/issues/1744

It would be great to confirm whether the particular run executes on a separate thread. The main issue is not sample, but the cost of Synchronize (aka the cost of waiting for...

I see, then it might be due to the fact that we are overusing the GPU resources for prefill.

We are moving towards a new engine and Android SDK, so this issue is less relevant as of now, closing for now. We might still face some throttling issue in...

@mos-fine contributions are more than welcome!

Thanks @mos-fine, do you mind sending a pull request to add the support officially? It would be good to see the diff.