tdasika17 comments

Results: 14 comments of tdasika17

executorch model Inference time is higher than the torch model

Hi @GregoryComer, the timings I reported were with to_edge_transform_and_lower(). While experimenting, I tried to_backend(), and the inference time there is around ~27 seconds. So, my original observation of ~16...

executorch model Inference time is higher than the torch model

[pte_model_profiling.txt](https://github.com/user-attachments/files/19846795/pte_model_profiling.txt) Attaching the model profiling information.

executorch model Inference time is higher than the torch model

Ah, thanks! I fixed that: I modified the graph so that the '_aten_mm_default_' ops use '_aten_mul_tensor_' instead. Now the time has come down to 3.5 seconds. Are there any other ops from the list...

executorch model Inference time is higher than the torch model

Hi @mcr229, I already use this option in my build, _**option(EXECUTORCH_BUILD_KERNELS_OPTIMIZED "" ON)**_, and I have linked the library '_**optimized_native_cpu_ops_lib**_' to my app. [pte_model_profiling_3seconds.txt](https://github.com/user-attachments/files/19886551/pte_model_profiling_3seconds.txt) Attaching the profile of the model with the 3.5-second inference time....

executorch model Inference time is higher than the torch model

The model trace (.etdp) is collected during actual model inference; the code snippet is below. ``` Module model(model_path, Module::LoadMode::MmapUseMlockIgnoreErrors, std::move(etdump_gen_)); vector output_ids = generate(model, input_tokens); ETDumpGen* etdump_gen = static_cast<ETDumpGen*>(model.event_tracer()); ET_LOG(Info,...

executorch model Inference time is higher than the torch model

Hi, sorry for the confusion. That is a separate Python script that generates pt.model and exports it. The 3.5 seconds I'm talking about is purely C++ inference time, which is timed...

executorch model Inference time is higher than the torch model

Hi @mcr229, yes, the time is computed for the overall inference, which covers the ingestion phase, output token generation, and post-processing of the output tokens. I have converted a .pt model...
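
Since the reported figure lumps several phases together, a per-phase breakdown can show where the time actually goes. Below is a minimal sketch, not the code used in the issue: `PhaseTimer` and the phase names are hypothetical, and it only measures wall-clock time around whatever callable is passed in.

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ExecuTorch): records wall-clock time per named phase.
class PhaseTimer {
 public:
  // Runs fn() once and stores its elapsed time in milliseconds under `name`.
  template <typename Fn>
  void run(const std::string& name, Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    const double ms =
        std::chrono::duration<double, std::milli>(end - start).count();
    phases_.emplace_back(name, ms);
  }

  // Sum of all recorded phase times, in milliseconds.
  double total_ms() const {
    double total = 0.0;
    for (const auto& phase : phases_) total += phase.second;
    return total;
  }

  const std::vector<std::pair<std::string, double>>& phases() const {
    return phases_;
  }

  // Prints each phase with its share of the total.
  void report() const {
    const double total = total_ms();
    for (const auto& phase : phases_) {
      const double share = total > 0.0 ? 100.0 * phase.second / total : 0.0;
      std::printf("%-16s %10.3f ms (%5.1f%%)\n", phase.first.c_str(),
                  phase.second, share);
    }
    std::printf("%-16s %10.3f ms\n", "total", total);
  }

 private:
  std::vector<std::pair<std::string, double>> phases_;
};
```

Assuming the application exposes each stage as a separate call, each one would be wrapped as `timer.run("ingestion", [&] { /* stage call */ });`, followed by `timer.report()` once the run finishes.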

executorch model Inference time is higher than the torch model

@mcr229, here is the profiling of the model during real inference. I generated etdump.bin while exporting and used the dev tools to generate inspector.txt. Attaching the Excel file related to the above table....

executorch model Inference time is higher than the torch model

Hi @mcr229, the profiling is done on the same machine. When I said the inference time was ~3.5 sec, I meant that I timed it before and after the generate method. My...
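
A single before/after measurement can include one-time costs from the first call, so it may help to discard a few warm-up calls and take the median of several timed runs. The sketch below illustrates that idea under stated assumptions: `median_latency_ms` is a hypothetical helper, and the callable stands in for whatever wraps the generate call.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls fn() `warmup` times untimed, then `runs` times
// timed, and returns the median wall-clock latency in milliseconds.
// Returns 0.0 when `runs` is zero.
template <typename Fn>
double median_latency_ms(Fn&& fn, std::size_t warmup, std::size_t runs) {
  for (std::size_t i = 0; i < warmup; ++i) fn();  // not measured

  std::vector<double> samples;
  samples.reserve(runs);
  for (std::size_t i = 0; i < runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    samples.push_back(
        std::chrono::duration<double, std::milli>(end - start).count());
  }
  if (samples.empty()) return 0.0;

  std::sort(samples.begin(), samples.end());
  const std::size_t mid = samples.size() / 2;
  // Odd count: middle sample. Even count: mean of the two middle samples.
  return samples.size() % 2 == 1 ? samples[mid]
                                 : 0.5 * (samples[mid - 1] + samples[mid]);
}
```

`std::chrono::steady_clock` is used because it is monotonic, so adjustments to the system clock do not distort the interval. A call such as `median_latency_ms([&] { /* generate call */ }, 2, 10)` would report the median over ten timed runs after two untimed ones.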

executorch model Inference time is higher than the torch model

Hi @mcr229, today I tried the ExecuTorch model on a different x86 server. I got a different inference time there for the same application (~7.8 sec); this may be because of...