Max Ren
@tdasika17 just wanted to follow up to make sure we have a path forward for this. Let me know if you're still encountering issues with the high inference times.
@tdasika17 could you share the profiled timings associated with the above table? The previous timings .txt files show a native_call_mm.out entry, however the ops lists shared don't include native_call_mm.out....
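If it's easier than sharing the raw .txt files, the per-operator timings can also be pulled out of an ETDump with the devtools Inspector. A minimal sketch, assuming the ETDump/ETRecord flow; the file paths here are placeholders and the exact API may vary between ExecuTorch versions:

```python
# Sketch: pull per-operator timings out of an ETDump from a profiled run.
# Assumes the ExecuTorch devtools (ETRecord + ETDump) flow; the file paths
# are placeholders and the exact API may differ between versions.
from executorch.devtools import Inspector

inspector = Inspector(
    etdump_path="etdump.etdp",   # ETDump emitted by the instrumented runner (placeholder)
    etrecord="etrecord.bin",     # ETRecord saved at export time (placeholder)
)

# Print a table of profiled events, including individual operator calls
# (e.g. the mm kernels), with their measured latencies.
inspector.print_data_tabular()
```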
Hi @tdasika17, based on your timings it looks like the model inference is only taking 71ms, and model load is around 167ms. I believe this should be significantly faster...
@tdasika17 thanks for the clarification! The ExecuTorch model is expected to be faster than the PyTorch model on CPU, but it might be helpful to share how the PyTorch...
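For reference, here's a rough sketch of how I'd time the eager PyTorch path on CPU so the comparison is apples-to-apples; the model, input shape, thread count, and iteration counts below are stand-ins, so substitute your actual model and inputs:

```python
# Rough sketch: time the eager PyTorch model on CPU for comparison with the
# ExecuTorch numbers. The model and input below are stand-ins for illustration.
import time
import torch

torch.set_num_threads(1)  # match the thread count used for the ExecuTorch run

# Stand-in model; replace with the actual model being exported.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 512),
).eval()
example_input = torch.randn(1, 512)  # stand-in input shape

with torch.inference_mode():
    for _ in range(10):  # warm-up so one-time allocations don't skew the timing
        model(example_input)

    iters = 100
    start = time.perf_counter()
    for _ in range(iters):
        model(example_input)
    elapsed_ms = (time.perf_counter() - start) / iters * 1e3

print(f"eager PyTorch CPU latency: {elapsed_ms:.3f} ms/iter")
```

Matching the thread count and input shapes used for the ExecuTorch run matters here, otherwise the two numbers aren't really comparable.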
@tdasika17 yes, this is definitely useful. I think the graph you sent is for a single thread. Would it be possible to share the entire .svg file? This would definitely...
Would it be possible to share the flame graph files? It'll help with inspecting the call stacks. On a cursory look at these, I can't immediately tell what the discrepancy is...
> For Torch model, I just ran the Torch c++ application and captured the graphs for C++ application execution

What do you mean by this? What is the capture flow?...
We see some rather significant speed up on prefill performance for Llama Models:

### Before:
```
I 00:00:05.587790 executorch:stats.h:84] Prompt Tokens: 64 Generated Tokens: 63
I 00:00:05.587793 executorch:stats.h:90] Model Load...
@alankelly @gonnet