Lu Ken comments

Results: 4 comments by Lu Ken

Diarization pipeline v3.1 is much slower than 3.0 when running on CPU

> Thanks @hbredin, loading into memory really helped. With that, the performance is tolerable and a 1h file finishes within a few minutes (

Diarization pipeline v3.1 is much slower than 3.0 when running on CPU

I tested "Diarization pipeline v3.0" on CPU and also found that its latency is lower than v3.1's (50s -> 30s).

Native integration with KEDA for LLM inference autoscaling

vLLM supports continuous batching for handling multiple requests, but a large batch results in a long TOF. Could we also consider increasing or decreasing the batch size?
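
One way to picture the suggestion above, as a rough sketch only: grow the batch limit slowly while latency stays within budget, and cut it quickly when latency goes over. The class name, fields, and default values below are all hypothetical; none of this is vLLM or KEDA API, and the latency fed in is assumed to be whatever first-response metric the server already reports.

```python
from dataclasses import dataclass


@dataclass
class BatchLimitController:
    """Hypothetical additive-increase / multiplicative-decrease controller
    for the maximum batch size of an inference server."""

    target_latency_s: float   # assumed latency budget, in seconds
    min_batch: int = 1        # never shrink below this
    max_batch: int = 256      # never grow above this
    batch_limit: int = 32     # current cap on requests per batch
    step_up: int = 4          # additive increase when latency is healthy
    backoff: float = 0.5      # multiplicative decrease when over budget

    def update(self, observed_latency_s: float) -> int:
        """Return the new batch limit given the latest observed latency."""
        if observed_latency_s > self.target_latency_s:
            # Over budget: shrink quickly so new requests start sooner.
            new_limit = int(self.batch_limit * self.backoff)
        else:
            # Within budget: grow slowly to recover throughput.
            new_limit = self.batch_limit + self.step_up
        # Clamp to the configured bounds.
        self.batch_limit = max(self.min_batch, min(self.max_batch, new_limit))
        return self.batch_limit
```

Calling `update()` once per metrics interval would let the batch limit follow load, while replica-level autoscaling handles the case where even the smallest batch cannot meet the budget.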

AMX isa Native addition

Question: Does enabling AMX only require adding a compiler option, without changing any gmm code for tile operations? Thanks! Have you tested whether AMX_BUSY is found via perf when running...