Unable to Convert DeepFilterNet to ExecuTorch
🐛 Describe the bug
Thank you for your documentation on ExecuTorch. I am working on DeepFilterNet and want to make it run on edge devices. I am trying to convert it to ExecuTorch but am running into an issue.
I followed the custom-ops section of the torch.export tutorial (https://pytorch.org/tutorials/intermediate/torch_export_tutorial.html#custom-ops) for the ExecuTorch implementation of DeepFilterNet PureTorch: https://github.com/grazder/DeepFilterNet/blob/1097015d53ced78fb234e7d7071a5dd4446e3952/torchDF/torch_df_streaming.py.
I am getting the following error when I run these lines:
example_input = (chunked_audio[0].to(device), states.to(device), atten_lim_db.to(device))
# torch_streaming_model = torch.jit.load(model_path).to(device)
replace_custom_layers(torch_streaming_model)
scripted_model = torch.jit.trace(torch_streaming_model)
# Export the scripted model
executorch_model = export(scripted_model, example_input)
Traceback (most recent call last):
  File "/usr/lib/python3.10/inspect.py", line 2547, in _signature_from_callable
    sig = _get_signature_of(call)
  File "/usr/lib/python3.10/inspect.py", line 2468, in _signature_from_callable
    return _signature_from_builtin(sigcls, obj,
  File "/usr/lib/python3.10/inspect.py", line 2275, in _signature_from_builtin
    raise ValueError("no signature found for builtin {!r}".format(func))
ValueError: no signature found for builtin <instancemethod call at 0x7f21a4d1e470>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/d/puretorch/DeepFilterNet/torchDF/custom_ops_imp.py", line 99, in
Versions
Collecting environment information...
PyTorch version: 2.1.0+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
CPU family: 6
Model: 140
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
BogoMIPS: 5606.40
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect flush_l1d arch_capabilities
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 192 KiB (4 instances)
L1i cache: 128 KiB (4 instances)
L2 cache: 5 MiB (4 instances)
L3 cache: 12 MiB (1 instance)
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] numpy==1.24.4
[pip3] onnx==1.16.0
[pip3] onnxruntime==1.17.0
[pip3] onnxruntime_extensions==0.10.1
[pip3] onnxscript==0.1.0.dev20240521
[pip3] onnxsim==0.4.36
[pip3] torch==2.1.0+cpu
[pip3] torchaudio==2.1.0+cpu
[conda] Could not collect
Don't export the jit-scripted model. Use torch.export on the original nn.Module.
Can you try that and let us know?
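For example (a minimal sketch, assuming torch_streaming_model and example_input are the same objects as in your snippet):

from torch.export import export

# Export the original module directly; no torch.jit.script / torch.jit.trace step.
exported_program = export(torch_streaming_model, example_input)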
Thank you for your response. Without torch.jit.script, I am getting the following issue:
/home/anamrasool/.cache/pypoetry/virtualenvs/deepfilternet-v6NBIXXw-py3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py:634: UserWarning: Was not able to add assertion to guarantee correct input input_audio to specialized function. It is up to the user to make sure that your inputs match the inputs you specialized the function with.
warnings.warn(
/home/anamrasool/.cache/pypoetry/virtualenvs/deepfilternet-v6NBIXXw-py3.10/lib/python3.10/site-packages/torch/overrides.py:110: UserWarning: 'has_cuda' is deprecated, please use 'torch.backends.cuda.is_built()'
torch.has_cuda,
/home/anamrasool/.cache/pypoetry/virtualenvs/deepfilternet-v6NBIXXw-py3.10/lib/python3.10/site-packages/torch/overrides.py:111: UserWarning: 'has_cudnn' is deprecated, please use 'torch.backends.cudnn.is_available()'
torch.has_cudnn,
/home/anamrasool/.cache/pypoetry/virtualenvs/deepfilternet-v6NBIXXw-py3.10/lib/python3.10/site-packages/torch/overrides.py:117: UserWarning: 'has_mps' is deprecated, please use 'torch.backends.mps.is_built()'
torch.has_mps,
/home/anamrasool/.cache/pypoetry/virtualenvs/deepfilternet-v6NBIXXw-py3.10/lib/python3.10/site-packages/torch/overrides.py:118: UserWarning: 'has_mkldnn' is deprecated, please use 'torch.backends.mkldnn.is_available()'
torch.has_mkldnn,
Traceback (most recent call last):
File "/mnt/d/puretorch/DeepFilterNet/torchDF/executorch_check.py", line 681, in deepfilternet3
2024-07-09 16:38:28 | INFO | DF | Found checkpoint /home/anamrasool/.cache/DeepFilterNet/DeepFilterNet3/checkpoints/model_120.ckpt.best with epoch 120
2024-07-09 16:38:28 | INFO | DF | Running on device cpu
2024-07-09 16:38:28 | INFO | DF | Model loaded
Traceback (most recent call last):
File "/mnt/d/puretorch/DeepFilterNet/torchDF/executorch_check.py", line 681, in
from user code:
  File "/mnt/d/puretorch/DeepFilterNet/torchDF/executorch_check.py", line 619, in forward
    ) = self.torch_streaming_model(
  File "/home/anamrasool/.cache/pypoetry/virtualenvs/deepfilternet-v6NBIXXw-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/puretorch/DeepFilterNet/torchDF/executorch_check.py", line 466, in forward
    ) = self.unpack_states(states)
  File "/mnt/d/puretorch/DeepFilterNet/torchDF/executorch_check.py", line 417, in unpack_states
    torch.tensor(torch.nonzero(erb_norm_state).shape[0] == 0),
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
Also, do I need to do anything about the custom layers in my model for torch.export.export()?
Thanks for trying it out @AnamRasool-pixel. Is this the nn.Module that you are trying to export? https://github.com/grazder/DeepFilterNet/blob/1097015d53ced78fb234e7d7071a5dd4446e3952/torchDF/torch_df_streaming.py#L18
Could you also send us the export command you tried with?
Yes, it is an nn.Module. Initially I was trying to save the model using torch.jit.script, but it resulted in a RecursiveScriptModule that could not be exported using torch.export.export(). Now I am using the following command. I have tried it both with and without capture_pre_autograd_graph; both ways give the same error. I think the model architecture is not traceable and it is resulting in graph breaks, but I am not sure. I want to deploy DeepFilterNet on edge devices; that is why I was trying to implement it in ExecuTorch.

device = 'cpu'
states_full_len = self.torch_streaming_model.states_full_len
states = torch.zeros(states_full_len, device=device)
atten_lim_db = torch.tensor(0.0, device=device)
example_input = (chunked_audio[0].to(device), states.to(device), atten_lim_db.to(device))
pre_autograd_aten_dialect = capture_pre_autograd_graph(self.torch_streaming_model, example_input)
aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect, example_input)
Thanks for the clarification and error logs @AnamRasool-pixel. The nn.Module looks a bit complicated; it may require some rewriting to get it to export.
From the logs, it looks like it's failing on the condition torch.tensor(torch.nonzero(erb_norm_state).shape[0] == 0) with the error message Could not infer dtype of SymBool.
There was a fix added for this in https://github.com/pytorch/pytorch/pull/125656. Can you try updating to main / the most recent nightly and trying again? Your version of torch is on 2.1, whereas main is on 2.5.
From https://pytorch.org/
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
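If updating is not convenient, an untested alternative is to rewrite the failing expression so it no longer depends on the data-dependent nonzero count at all:

# Original, data-dependent on the number of nonzero elements:
#   torch.tensor(torch.nonzero(erb_norm_state).shape[0] == 0)
# Shape-independent equivalent: true iff every element is zero.
torch.all(erb_norm_state == 0)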
cc @angelayi for export issues.
Thank you so much for the response. It solved the problem I was facing. You are right, the code is complicated and I have to rewrite some parts of it to make it exportable. Now I am having the following issue:
File "/mnt/d/puretorch/DeepFilterNet/torchDF/check.py", line 507, in forward
    apply_gains = self.is_apply_gains(lsnr)
File "/mnt/d/puretorch/DeepFilterNet/torchDF/check.py", line 323, in is_apply_gains
    if self.always_apply_all_stages:
I think is_apply_gains() needs to be rewritten. Can you guide me on how to rewrite
def is_apply_gains(self, lsnr: Tensor) -> Tensor:
"""
Original code - libDF/src/tract.rs - is_apply_stages()
This code decomposed for better graph capturing
Parameters:
lsnr: Tensor[Float] - predicted lsnr value
Returns:
output: Tensor[Bool] - whether to apply gains or not
"""
if self.always_apply_all_stages:
return torch.ones_like(lsnr, dtype=torch.bool)
return torch.le(lsnr, self.max_db_erb_thresh) * torch.ge(lsnr, self.min_db_thresh)
to make it exportable using torch.export.export()?
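For reference, I was considering a branch-free version like the sketch below, though I am not sure it is correct:

def is_apply_gains(self, lsnr: Tensor) -> Tensor:
    # Compute both outcomes and select with torch.where, so export does not
    # need to trace a Python `if` on self.always_apply_all_stages.
    apply_all = torch.ones_like(lsnr, dtype=torch.bool)
    in_range = torch.le(lsnr, self.max_db_erb_thresh) & torch.ge(lsnr, self.min_db_thresh)
    return torch.where(torch.as_tensor(self.always_apply_all_stages), apply_all, in_range)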
@AnamRasool-pixel, glad to hear you're making progress. Could you paste the error logs you're seeing now?
As an aside, it looks like the code has calls to item(), tolist() and nonzero(). These may also cause issues during export. You can try to resolve them by adding checks to hint these values to the compiler, see Dealing with GuardOnDataDependentSymNodes.
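For example, a data-dependent size coming out of nonzero() can sometimes be unblocked by hinting its range to the compiler (a rough illustration, not code from your model):

count = torch.nonzero(erb_norm_state).shape[0]
# Assert a range for the data-dependent value so export can resolve
# guards on it instead of raising a GuardOnDataDependentSymNode error.
torch._check(count >= 0)
torch._check(count <= erb_norm_state.numel())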
I have exported DeepFilterNet by rewriting some functions of the model. Now I want to implement the second step, exir.to_edge, but I am unable to install executorch on my local machine, maybe because the environment made for the DeepFilterNet execution is read-only. I get the following error when I try to install executorch with pip install executorch==0.2.1:
ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/home/anamrasool/.cache/pypoetry/virtualenvs/deepfilternet-v6NBIXXw-py3.10/lib/python3.10/site-packages/sortedcontainers'.
I also tried another approach: I saved the exported model as
aten_dialect: ExportedProgram = export(self.torch_streaming_model, example_input)
torch.save(aten_dialect, 'aten_exported_model.pt')
print("successful")
When I try to load it in Colab using aten_dialect: ExportedProgram = torch.load(model_path), I get the following error:
TypeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/torch/serialization.py in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
   1023     except RuntimeError as e:
   1024         raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1025     return _load(opened_zipfile,
   1026                  map_location,
   1027                  pickle_module,

/usr/local/lib/python3.10/dist-packages/torch/serialization.py in _load(zip_file, map_location, pickle_module, pickle_file, overall_storage, **pickle_load_args)
   1444     unpickler = UnpicklerWrapper(data_file, **pickle_load_args)
   1445     unpickler.persistent_load = persistent_load
-> 1446     result = unpickler.load()
   1447
   1448     torch._utils._validate_loaded_sparse_tensors()

/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py in reduce_graph_module(body, import_block)
    127     fn_src = body.get("_code") or body["code"]
    128     forward = _forward_from_src(import_block + fn_src, {})
--> 129     return _deserialize_graph_module(forward, body)
    130
    131

/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py in _deserialize_graph_module(forward, body, graph_module_cls)
    191
    192     tracer_extras = body.get("_tracer_extras", {})
--> 193     graph = KeepModules().trace(com, **tracer_extras)
    194
    195     # Manually set Tracer class on the reconstructed Graph, to avoid
TypeError: _ModuleStackTracer.__init__() missing 1 required positional argument: 'scope_root'
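I also wonder whether torch.export.save / torch.export.load is the right way to serialize an ExportedProgram instead of torch.save; something like this (I have not verified it yet):

# Assumes a recent torch build where torch.export.save/load are available.
torch.export.save(aten_dialect, 'aten_exported_model.pt2')
loaded_program: ExportedProgram = torch.export.load('aten_exported_model.pt2')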
Any guidance regarding these issues would be much appreciated.
Hi @AnamRasool-pixel, I'm wondering if using a virtual env for executorch will resolve the read-only issue? From https://pytorch.org/executorch/stable/getting-started-setup.html
# Create and activate a conda environment named "executorch"
conda create -yn executorch python=3.10.0
conda activate executorch
And, were you able to run through the ExecuTorch setup outside of the deepfilternet environment?
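Once the install works, the exir.to_edge step you mentioned should look roughly like this (a sketch following the getting-started docs, reusing your aten_dialect variable):

from executorch.exir import to_edge

# Lower the exported ATen-dialect program to Edge dialect, then to an
# ExecuTorch program, and serialize it to a .pte file.
edge_program = to_edge(aten_dialect)
executorch_program = edge_program.to_executorch()
with open("dfn_exe.pte", "wb") as f:
    f.write(executorch_program.buffer)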
Yes, I was able to run executorch outside the deepfilternet environment. Thanks a lot for the guidance.
@lucylq I have converted DeepFilterNet to ExecuTorch and saved the converted .pte file. I am using a virtual environment for this process. Now I want to build its C++ implementation. I am following the steps mentioned in https://pytorch.org/executorch/stable/getting-started-setup.html. When I run ./cmake-out/executor_runner --model_path dfn_exe.pte, I get the following error:

(executorchRun) anamrasool@Younite-013:/mnt/d/executorch/executorch$ ./cmake-out/executor_runner --model_path dfn_exe.pte
I 00:00:00.038415 executorch:executor_runner.cpp:73] Model file dfn_exe.pte is loaded.
I 00:00:00.038464 executorch:executor_runner.cpp:82] Using method forward
I 00:00:00.038468 executorch:executor_runner.cpp:129] Setting up planned buffer 0, size 23972608.
I 00:00:00.055151 executorch:executor_runner.cpp:152] Method loaded.
I 00:00:00.055357 executorch:executor_runner.cpp:162] Inputs prepared.
E 00:00:00.219480 executorch:tensor_util_portable.cpp:62] Expected tensor to have default or channels last dim order, but got
E 00:00:00.219513 executorch:tensor_util_portable.cpp:66] dim_order(0): 0
E 00:00:00.219516 executorch:tensor_util_portable.cpp:66] dim_order(1): 2
E 00:00:00.219517 executorch:tensor_util_portable.cpp:66] dim_order(2): 1
E 00:00:00.219518 executorch:tensor_util_portable.cpp:66] dim_order(3): 3
E 00:00:00.219519 executorch:kernel_ops_util.cpp:310] Check failed (tensor_is_default_or_channels_last_dim_order(weight)):
E 00:00:00.219521 executorch:op_convolution.cpp:281] Check failed (check_convolution_args( in, weight, bias, stride, padding, dilation, transposed, output_padding, groups, out)):
E 00:00:00.219523 executorch:method.cpp:1027] KernelCall failed at instruction 0:106 in operator aten::convolution.out: 0x12
E 00:00:00.219540 executorch:method.cpp:1036] arg 0 with type id 1
E 00:00:00.219542 executorch:method.cpp:1036] arg 1 with type id 1
E 00:00:00.219545 executorch:method.cpp:1036] arg 2 with type id 0
E 00:00:00.219546 executorch:method.cpp:1036] arg 3 with type id 8
E 00:00:00.219547 executorch:method.cpp:1036] arg 4 with type id 8
E 00:00:00.219547 executorch:method.cpp:1036] arg 5 with type id 8
E 00:00:00.219548 executorch:method.cpp:1036] arg 6 with type id 5
E 00:00:00.219549 executorch:method.cpp:1036] arg 7 with type id 8
E 00:00:00.219551 executorch:method.cpp:1036] arg 8 with type id 4
E 00:00:00.219552 executorch:method.cpp:1036] arg 9 with type id 1
E 00:00:00.219555 executorch:method.cpp:1036] arg 10 with type id 1
F 00:00:00.219556 executorch:executor_runner.cpp:166] In function main(), assert failed (status == Error::Ok): Execution of method forward failed with status 0x12
Aborted.
I checked that the DeepFilterNet convolutional layers use the channels-first convention, while executor_runner works with the channels-last convention. Is there any workaround for this?
Hey @AnamRasool-pixel, thanks for your patience. @Gasoonjia, do you know any workarounds for this? Or maybe @SS-JIA for convolution?
Hi @AnamRasool-pixel, from the error message the dim order is (0, 2, 1, 3) for one of your tensors. It is neither a channels_last tensor (0, 2, 3, 1) nor a contiguous (channels_first) tensor (0, 1, 2, 3), so our system cannot handle it.
- We have noticed that there are some dim-order generation issues in core PyTorch. https://github.com/pytorch/pytorch/pull/131366 mitigates the issue and was just approved. You can try the PR later to see if everything goes well.
- Please help us check whether your model uses a memory format other than contiguous or channels_last on any 4-D tensor (a quick way to audit this is sketched below). If it does, consider updating your code to one of those two formats.

If the above two do not solve your issue, we are also working on some workarounds on the ET side. We will keep you posted if we make any progress.
Thanks for sharing your feedback!
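If it helps, here is a quick way to audit the 4-D tensors in your model (an illustrative snippet; `model` stands for your nn.Module):

import torch

def report_memory_format(t: torch.Tensor, name: str) -> None:
    # Only 4-D tensors have a channels_last interpretation.
    if t.dim() != 4:
        return
    if t.is_contiguous():
        print(f"{name}: contiguous (channels_first)")
    elif t.is_contiguous(memory_format=torch.channels_last):
        print(f"{name}: channels_last")
    else:
        print(f"{name}: neither contiguous nor channels_last")

for name, tensor in list(model.named_parameters()) + list(model.named_buffers()):
    report_memory_format(tensor, name)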
Thanks for the response. I have checked the tensors of my model; they are either contiguous or channels_last. Why am I getting an error about dim order when I try to build the runtime for the exported DeepFilterNet model?
Also, I tried to create the module of my exported DeepFilterNet file using the CMake build generated by following these steps:
# Clean and configure the CMake build system. Compiled programs will appear in the executorch/cmake-out directory we create here.
(rm -rf cmake-out && mkdir cmake-out && cd cmake-out && cmake ..)
# Build the executor_runner target
cmake --build cmake-out --target executor_runner -j9
but I get the following issue:
[ 97%] Linking CXX executable executor_runner
/usr/bin/ld: CMakeFiles/executor_runner.dir/examples/portable/executor_runner/executor_runner.cpp.o: in function `main':
/mnt/d/executorch/executorch/examples/portable/executor_runner/executor_runner.cpp:40: undefined reference to `torch::executor::Module::Module(std::__cxx11::basic_string<char, std::char_traits
... in function `torch::executor::Module::forward(std::vector<torch::executor::EValue, std::allocator<torch::executor::EValue> > const&)':
/mnt/d/executorch/executorch/../executorch/extension/module/module.h:167: undefined reference to `torch::executor::Module::execute(std::__cxx11::basic_string<char, std::char_traits
How should I proceed? Which approach is advisable: using Module, or not using it?
Hi @AnamRasool-pixel, for the dim-order issue, I think there might be some bugs in our runtime when generating the dim order. Let me double-check and share my feedback.
For the build issue, @dbort, do you have any suggestions? I think Module is a good entry point?
Hi @lucylq, can you please update me on my query?
Hi @dbort, I have been trying to build a Module for my deepfilternet.pte file, but I am getting linking issues while including module.h. I am using the CMakeLists.txt given here: https://github.com/pytorch/executorch/blob/main/CMakeLists.txt. I have modified it to add extension_module with the following line:
target_link_libraries(executor_runner ${_executor_runner_libs} extension_module)
but I am getting the following errors:
/usr/bin/ld: cannot find -lextension_module: No such file or directory
collect2: error: ld returned 1 exit status
gmake[2]: *** [CMakeFiles/executor_runner.dir/build.make:183: executor_runner] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:293: CMakeFiles/executor_runner.dir/all] Error 2
gmake: *** [Makefile:136: all] Error 2
I think the libextension_module library is missing.
Any solution to this issue?
Hey @AnamRasool-pixel,
Thanks for your patience; to build executor_runner with extension module, you'll need a few more changes. Can you try this: https://github.com/pytorch/executorch/pull/4972/files
This enables EXECUTORCH_BUILD_EXTENSION_MODULE and EXECUTORCH_BUILD_EXTENSION_DATA_LOADER, so extension_module will be built and available before being linked to executor_runner.
And following https://github.com/pytorch/executorch/tree/main/examples/portable, run these commands:
# Build the tool from the top-level `executorch` directory.
(rm -rf cmake-out \
&& mkdir cmake-out \
&& cd cmake-out \
&& cmake -DEXECUTORCH_PAL_DEFAULT=posix ..) \
&& cmake --build cmake-out -j32 --target executor_runner
This works well for me locally. Let me know how it goes for you!
Thanks @lucylq for the response. The solution you gave solved the module.h linking problem, but when I run main.cpp it gives me errors related to kernel registration. Also, I noticed the executorch repo is constantly updating; I tried to clone it again and ran into new errors. Do you have any idea which version of executorch should be used to convert a deep learning model for deployment on mobile devices? Thanks for replying to all my queries and guiding me through this. I am stuck right now and have no clue whether this conversion is even possible, since there are custom layers in the model that I changed before exporting it.
Hi @AnamRasool-pixel,
Sorry to hear you're running into all these issues, and thank you for being patient and continuing to try with ExecuTorch. It sounds like you are close; previously you could run the model in ExecuTorch, and ran into the dim order issue.
Regarding a stable version, ExecuTorch beta is releasing late October, and that will have more API stability. You can try the release branch (latest is v0.3.0), though iirc some of the issues you encountered were resolved by updating to a more recent version.
For now, can you try:
- Clean your environment: rm -rf cmake-out && rm -rf pip-out, and follow the installation instructions.
- You should be able to run your exported .pte file using a runtime from ExecuTorch main. If you see errors building from a fresh install, please paste your command and error message.
I have cloned the v0.3.0 branch and built with CMake again. I don't get any issues while linking the module.h library. When I try to run the main.cpp which includes module.h, I get the following error:
E 00:00:00.001914 executorch:operator_registry.cpp:75] Re-registering aten::sym_size.int, from NOT_SUPPORTED
E 00:00:00.001963 executorch:operator_registry.cpp:76] key: (null), is_fallback: true
F 00:00:00.001966 executorch:operator_registry.cpp:29] In function register_kernels(), assert failed (false): Kernel registration failed with error 18, see error log for details.
Aborted.
Hi @AnamRasool-pixel, not sure why we're seeing a double registration here. Can you provide a PR containing your code changes with repro instructions?
Hey @lucylq,
I've been working on exporting my DeepFilterNet model to the edge dialect using Executorch. You can find my work here: https://github.com/AnamRasool-pixel/Executorch-export-DFN.git.
The code responsible for exporting the model is located in torchDF/executorch_export_model.py. I used the CMake changes you provided to link module.h when building the runner for the exported dfn_exe.pte file.
I'm trying to run the exported model using this code: https://github.com/AnamRasool-pixel/Executorch-DFN/blob/Build-DFN-exe/examples/portable/executor_runner/executor_runner.cpp. However, I'm encountering an error:
E 00:00:00.001914 executorch:operator_registry.cpp:75] Re-registering aten::sym_size.int, from NOT_SUPPORTED
E 00:00:00.001963 executorch:operator_registry.cpp:76] key: (null), is_fallback: true
F 00:00:00.001966 executorch:operator_registry.cpp:29] In function register_kernels(), assert failed (false): Kernel registration failed with error 18, see error log for details.
Aborted.
Thanks for your help!
Hey @lucylq, any update regarding my query?
Hi @AnamRasool-pixel, sorry for the delay, just returned from leave. Are you still seeing this issue? I have a suspicion that it's due to executorch_no_prim_ops being included twice and causing the double registration. I'll give it a try and get back to you.
Hey @lucylq, thanks for the reply. Yes, I am still seeing this issue. Please do give it a try, and tell me if there is anything I can do to make it work.
Hi @AnamRasool-pixel,
Ah, I think the issue is that we're linking extension_module into executor_runner, and both link the executorch lib which contains the prim ops, which is why we're seeing the double registration. Sorry, that's my bad. I think we should either use Module or build a runner such as the sample executor_runner, and not mix them.
- Module is an API that allows you to run .pte files without some of the overhead of creating your own runner. You can follow the docs to use Module; there are some examples in the tests as well. I have a feeling this may fail with the same dim-order issue, though we can try it.
- executor_runner is a sample runner that can be used to run .pte files. Trying this with your .pte file, I still see the dim-order issue. cc @Gasoonjia, have there been any updates on the dim-order side?
(executorch) [[email protected] /data/users/lfq/executorch (8957dc8a)]$ ./cmake-out/executor_runner --model_path dfn_exe.pte
I 00:00:00.015929 executorch:executor_runner.cpp:82] Model file dfn_exe.pte is loaded.
I 00:00:00.015981 executorch:executor_runner.cpp:91] Using method forward
I 00:00:00.015994 executorch:executor_runner.cpp:138] Setting up planned buffer 0, size 23972608.
I 00:00:00.043892 executorch:executor_runner.cpp:161] Method loaded.
I 00:00:00.044248 executorch:executor_runner.cpp:171] Inputs prepared.
E 00:00:00.045855 executorch:tensor_util_portable.cpp:128] Check failed (all_contiguous || all_channels_last): 2 input tensors have different dim orders
E 00:00:00.045874 executorch:op_expand_copy.cpp:88] Check failed (tensors_have_same_dim_order(self, out)):
E 00:00:00.045878 executorch:method.cpp:1038] KernelCall failed at instruction 0:54 in operator aten::expand_copy.out: 0x12
E 00:00:00.045881 executorch:method.cpp:1047] arg 0 with type id 1
E 00:00:00.045883 executorch:method.cpp:1047] arg 1 with type id 8
E 00:00:00.045885 executorch:method.cpp:1047] arg 2 with type id 5
E 00:00:00.045887 executorch:method.cpp:1047] arg 3 with type id 1
E 00:00:00.045888 executorch:method.cpp:1047] arg 4 with type id 1
F 00:00:00.045890 executorch:executor_runner.cpp:175] In function main(), assert failed (status == Error::Ok): Execution of method forward failed with status 0x12
Aborted (core dumped)
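As an aside, you can also sanity-check the .pte directly from Python via the pybindings (an optional suggestion; if the dim-order problem is baked into the program, this should fail the same way):

from executorch.extension.pybindings.portable_lib import _load_for_executorch

# Load the ExecuTorch program and run `forward` with the same example
# inputs used at export time (names assumed from your export script).
module = _load_for_executorch("dfn_exe.pte")
outputs = module.forward([chunked_audio[0], states, atten_lim_db])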