Jason Dai

106 comments by Jason Dai

> **This bug is caused by using the XMX kernel in a new thread**; it won't happen if the model runs in the current thread. And I think its root cause is a...

> How about adding streaming LLM directly into bigdl-llm instead of making it another example? It can benefit other applications/examples. Let's do that in a separate PR (with design...

> RuntimeError: PyTorch is not linked with support for xpu devices

It seems the installed PyTorch does not support XPU. Can you share the specific PyTorch version installed, and try...

Add `import intel_extension_for_pytorch as ipex`?
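A minimal sketch of how one might check for the situation described in the two comments above. It is not taken from the thread: the helper name `xpu_status` and its return strings are hypothetical, and the sketch assumes that, on builds that rely on Intel Extension for PyTorch for XPU support, importing the extension is what makes `torch.xpu` usable.

```python
import importlib.util


def xpu_status() -> str:
    """Report whether this environment appears to expose an XPU device.

    Hypothetical helper for illustration; the return strings are arbitrary.
    """
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    # On builds that depend on IPEX for XPU support, this import registers
    # the backend, which is why the comment suggests adding it.
    if importlib.util.find_spec("intel_extension_for_pytorch") is not None:
        try:
            import intel_extension_for_pytorch as ipex  # noqa: F401
        except Exception as exc:  # e.g. a PyTorch/IPEX version mismatch
            return f"ipex import failed: {exc}"

    # torch.xpu may be absent on older or CPU-only builds, so look it up safely.
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return "xpu not available"
    return "xpu available"


print("torch version:", getattr(__import__("importlib").import_module("torch"), "__version__", "unknown")
      if importlib.util.find_spec("torch") is not None else "torch version: n/a")
print(xpu_status())
```

Printing the PyTorch version alongside the status gives the information the reply asks for when reporting the error.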

I have actually made all the changes I want :-)

We are setting up a cluster for large scale testing as well.

Hi Matei - sorry we haven't had enough time to look into this yet. Maybe we should push it to a future release, as we'll be working on the graphx...

Small-scale testing works fine, but we ran into some weird failures in large-scale testing and have not had enough time to look into them.

Please see https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/vLLM-Serving

> This is written in Python, can C++ be added as well? Python's performance is not satisfactory.

Unfortunately, there is no such plan at this moment.