executorch
Android app - Error - Attempted to resize a static tensor to a new shape at dimension 0
My Android application fails with an "Attempted to resize a static tensor to a new shape at dimension 0" error. Please find the full logcat below.
The shape of the input data for my model is not static: the number of steps varies from one sequence to another.
Here is the code I use to define the input dataset for the model in the Android application:
float[] flat = flatten(tmpData);
final long[] shapeArrDataPytorchFlattened = new long[]{tmpData.length, 4, 1};
arrDataPytorch = Tensor.fromBlob(flat, shapeArrDataPytorchFlattened);
where 4 is the number of features and tmpData.length is the number of rows in the input dataset (which has n rows and 4 columns).
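For reference, Tensor.fromBlob requires that the element count of the flattened buffer equal the product of the declared shape dimensions. A minimal sketch of that bookkeeping (Python for brevity; flatten here is a hypothetical stand-in for the Java helper above, and n = 27 is just an example row count):

```python
import math

# Hypothetical stand-in for the Java flatten() helper above.
def flatten(rows):
    return [v for row in rows for v in row]

tmp_data = [[0.0, 0.0, 0.0, 0.0] for _ in range(27)]  # n = 27 rows, 4 features
flat = flatten(tmp_data)
shape = (len(tmp_data), 4, 1)  # mirrors new long[]{tmpData.length, 4, 1}

# The flat buffer must have exactly as many elements as the shape implies;
# a mismatch here is one common cause of shape errors at inference time.
assert len(flat) == math.prod(shape)
print(len(flat), shape)
```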
Here is the code I use to run inference:
try {
Log.i(TAG, " - neuralNetworkloadAndRunPytorch - Abut to run inference --- ");
outputTensor = mModule.forward(from(arrDataPytorch)).toTensor();
} catch (Exception e) {
Log.i(TAG, " - neuralNetworkloadAndRunPytorch - Inference FAILED --- ");
throw new RuntimeException(e);
}
By contrast, when I run inference on the same model scripted with TorchScript and executed with PyTorch Mobile, I produce the input dataset as follows:
final long[] shapeArrDataPytorchFlattened = new long[]{1, flat.length}; //USED FOR PYTORCH MOBILE
arrDataPytorch = Tensor.fromBlob(flat, shapeArrDataPytorchFlattened);
and run inference as follows:
mModule = LiteModuleLoader.load(moduleFileAbsoluteFilePath);
outputTensor = mModule.forward(IValue.from(arrDataPytorch)).toTensor();
This works, producing reasonable results.
I would appreciate any thoughts as to what is causing the problem, and how I might go about fixing it.
Thanks
LOGCAT
12-05 16:48:49.983: I/NeuralNetworkService(16887): - NeuralNetworkServiceRunnable - neuralNetworkInputPreparationRunning - 1 - 0
12-05 16:48:49.983: I/NeuralNetworkService(16887): - NeuralNetworkServiceRunnable - neuralNetworkLoadAndRunRunning - 0 - 0
12-05 16:48:49.983: I/NeuralNetworkService(16887): - NeuralNetworkServiceRunnable - About to run neuralNetworkloadAndRun ---
12-05 16:48:49.983: I/NeuralNetworkService(16887): - neuralNetworkloadAndRunPytorch - Running -
12-05 16:48:49.983: I/NeuralNetworkService(16887): - neuralNetworkloadAndRunPytorch - locationInformationDir - /data/user/0/com.android.contextq/files/locationInformation/
12-05 16:48:49.983: I/NeuralNetworkService(16887): - neuralNetworkloadAndRunPytorch - savedNetworkArchiveLength - 120669888
12-05 16:48:49.983: I/NeuralNetworkService(16887): - neuralNetworkloadAndRunPytorch - Abut to load module ---
12-05 16:48:50.067: I/ETLOG(16887): Model file /data/user/0/com.android.contextq/files/locationInformation/tfmodel_exnnpack.pte is loaded.
12-05 16:48:50.067: I/ETLOG(16887): Setting up planned buffer 0, size 23366800.
12-05 16:48:50.077: W/libc(16887): Access denied finding property "ro.hardware.chipname"
12-05 16:48:50.078: W/adbd(13666): timeout expired while flushing socket, closing
12-05 16:48:50.080: D/XNNPACK(16887): allocated 6144 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.080: D/XNNPACK(16887): created workspace of size 774176
12-05 16:48:50.081: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.085: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.088: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.092: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.092: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.092: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.092: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.096: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.097: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.113: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.127: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.130: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.130: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.130: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.130: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.132: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.146: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.150: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.150: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.150: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.150: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.152: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.166: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.170: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.170: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.170: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.170: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.172: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.186: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.190: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.190: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.190: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.190: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.192: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.206: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.209: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.213: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.217: D/XNNPACK(16887): allocated 8192 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.217: D/XNNPACK(16887): created workspace of size 1327136
12-05 16:48:50.217: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.221: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.224: D/XNNPACK(16887): reusing tensor id #8 memory for tensor id #5 Node #2 Softmax
12-05 16:48:50.224: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.225: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.225: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.229: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.232: I/XNNPACK(16887): fuse Clamp Node #2 into upstream Node #1
12-05 16:48:50.234: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.249: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.263: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.269: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.273: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.276: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.277: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.281: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.284: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.286: D/StNfcHal(979): (#0C838) Rx 60 07 01 e2
12-05 16:48:50.286: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.301: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.315: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.319: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.322: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.323: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.327: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.331: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.334: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.338: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.338: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.342: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.344: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.357: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.358: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.361: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.362: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.365: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.367: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.381: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.381: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.385: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.385: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.389: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.390: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.404: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.405: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.408: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.409: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.412: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.414: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.427: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.428: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.431: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.432: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.435: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.437: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.450: D/XNNPACK(16887): allocated 16416 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.467: I/NeuralNetworkService(16887): - neuralNetworkloadAndRunPytorch - Abut to run inference ---
12-05 16:48:50.467: I/ETLOG(16887): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 12716
12-05 16:48:50.467: I/ETLOG(16887): Error setting input 0: 0x10
12-05 16:48:50.467: I/ETLOG(16887): In function forward(), assert failed: set_input_status == Error::Ok
12-05 16:48:50.467: A/libc(16887): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 16905 (Thread-2), pid 16887 (lNetworkService)
12-05 16:48:50.635: I/crash_dump64(17226): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
The difference between the Lite Interpreter (PyTorch Mobile) and ExecuTorch is that ExecuTorch plans memory ahead of time, which helps reuse and reduce memory at runtime. What is the dynamic part of the original PyTorch model? Can the dynamic part be upper-bounded?
Reference doc: https://pytorch.org/executorch/stable/compiler-memory-planning.html
@cccclai Thanks for the response. Please bear with me, as I am a beginner with PyTorch (my background is Java and Android application development).
I did not create the model I am using (https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master). I modified it slightly to accommodate my datasets (produced by my Android application).
When you ask about the dynamic part of the model, could you please clarify? Dynamic with respect to which variables/parts of the model?
The model is an attention-based Transformer network. Depending on what you mean by dynamic: since it is an encoder/decoder model, the input passed from one part of the model to another is dynamic.
I read the document you referenced in your message. Would it make sense for me to use the
alloc_graph_input=False option? If I did, it's not clear what I would need to do in my application to satisfy this:
"If the IO is not planned then users will be expected to provide data buffers to back these values at runtime"
Did I misunderstand the documentation?
Thanks
If the IO is not planned then users will be expected to provide data buffers to back these values at runtime
What this means is that the inputs you provide will either (a) be copied into the memory reserved for them during the memory-planning pass, if IO was part of memory planning, or (b) not be copied, since memory planning did not plan for them.
By default IO is planned, and hence if you follow https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/jni/jni_layer.cpp#L345, you will see that the output returned from the executor is referenced directly; there is a comment there on the lifetime of the pointer referenced by the output tensor.
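For context, the alloc_graph_input=False option discussed earlier in the thread is set when lowering the model, roughly like this. This is a sketch based on the ExecuTorch memory-planning docs; exact import paths, the planning algorithm name, and defaults may differ between releases, and `exported_program` is assumed to come from an earlier torch.export step:

```python
from executorch.exir import to_edge, ExecutorchBackendConfig
from executorch.exir.passes import MemoryPlanningPass

edge = to_edge(exported_program)  # exported_program produced by torch.export
et_program = edge.to_executorch(
    ExecutorchBackendConfig(
        memory_planning_pass=MemoryPlanningPass(
            "greedy",
            alloc_graph_input=False,   # runtime must supply buffers for inputs
        )
    )
)
```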
Now, with respect to dynamic size, there are a couple of things.
- Export + ExecuTorch support upper-bound dynamic sizes. That is, you can tag the input tensors as having bounded dynamic size, e.g. for each dimension of the input, the maximum size that dimension can take. This helps export understand the maximum size expected for input, output, and intermediate tensors. You can read more here: https://pytorch.org/docs/stable/export.html#expressing-dynamism, and @JacobSzwejbka is an expert on this who can give more input.
- Do you know what part of your model is dynamic? You may not know this, and that's fine. In that case I presume you just expect that "I should be able to supply input of varying size", right?
- Shape dynamism gets a bit more complicated if the model you are interested in is also lowered to a delegate such as XNNPACK. In that case the delegate also needs to support dynamic shapes. For XNNPACK, for example, this is not yet the case, so you won't be able to run a model via XNNPACK that takes inputs of different shapes.
@kimishpatel Thanks!
a) It looks like I may have a problem, as XNNPACK does not support shape dynamism and the input sequences in the model I use are of varying length.
b) Your assumption is correct. I expected to be able to provide input of varying size.
c) Given what @cccclai said in his comment regarding the difference between PyTorch Mobile and ExecuTorch, and based on your point 1 above, I will try the torch.export.dynamic_dim() API.
d) Why is delegation (e.g. via XNNPACK) necessary in order to lower a model onto an edge device using ExecuTorch? Sorry for bringing it up again. With PyTorch Mobile it was not (unless I am mistaken). Is there an alternative to using XNNPACK?
Thanks
Why is delegation (e.g. via XNNPACK) necessary in order to lower a model onto an edge device using executorch?
Delegation is for delegating part of the model, or the whole model, to a powerful backend on the device. Different edge devices may have different backends. XNNPACK (https://github.com/google/XNNPACK) is one of the most powerful CPU backends. On iOS, for example, there are other powerful backends (https://github.com/pytorch/executorch/tree/main/backends/apple) such as Core ML and MPS, and Qualcomm chipsets may have their own as well.
In PyTorch Mobile, XNNPACK is pretty much the default backend, and it runs after we call optimize_for_mobile. PyTorch Mobile also has a limited set of other backends, such as Core ML (https://pytorch.org/tutorials/prototype/ios_coreml_workflow.html).
@cccclai Thanks. When I used PyTorch Mobile, I chose not to optimize before lowering the model to the edge device; PyTorch Mobile gave me that option, if I am not mistaken. ExecuTorch does not give you the option to skip optimization, right? By the way, I chose not to optimize in order to first get the basic mechanism working (i.e., perform inference on the edge device successfully).
@adonnini to provide some context: in PyTorch Mobile, the optimize_for_mobile step essentially applies a pre-defined set of transformation passes to the TorchScript model to optimize it for a specific processor (CPU, GPU, etc.).
With ExecuTorch, this process is a bit more involved. A model is represented by default using the Edge IR. However, since XNNPACK is a powerful library, we provide a delegate which consumes the Edge IR and converts the model to XNNPACK's representation. The converted graph can then be executed by XNNPACK.
Essentially, ExecuTorch provides more control over how your model executes.
Regarding your initial issue, would you mind sharing how you produced your model? As mentioned before, XNNPACK doesn't support dynamic shapes.
@SS-JIA Here is a link to the model I use: https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master I modified it slightly to work with my dataset. I also added ExecuTorch code to train.py, which I use to produce the model. For the time being, I commented out the validation code. Is this the information you were looking for?
Hi, after fixing the dynamic_dim error (https://github.com/pytorch/executorch/issues/1379) with @angelayi's greatly appreciated help, I tried once again to use the model for inference in my Android app.
Unfortunately, the result was once again an "Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 11406" error, as described earlier in this issue's thread. Below you will find the latest traceback log.
Please let me know if you need me to do anything and what I should do next.
TRACEBACK LOG (it's long; I included all messages related to the failure rather than assuming what is relevant)
01-15 16:08:10.386: I/ETLOG(12852): Model file /data/user/0/com.android.contextq/files/locationInformation/TptDelegate.pte is loaded.
01-15 16:08:10.386: I/ETLOG(12852): Setting up planned buffer 0, size 31460272.
01-15 16:08:10.400: W/libc(12852): Access denied finding property "ro.hardware.chipname"
01-15 16:08:10.422: D/XNNPACK(12852): allocated 6144 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.423: D/XNNPACK(12852): created workspace of size 774176
01-15 16:08:10.424: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.442: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.461: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.479: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.479: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.479: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.480: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.501: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.575: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.597: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.603: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.603: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.603: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.604: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.621: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.622: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.622: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.623: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.641: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.641: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.641: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.642: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.659: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.660: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.660: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.661: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.666: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.672: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.676: D/XNNPACK(12852): allocated 8192 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.676: D/XNNPACK(12852): created workspace of size 1327136
01-15 16:08:10.677: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.683: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.687: D/XNNPACK(12852): reusing tensor id #8 memory for tensor id #5 Node #2 Softmax
01-15 16:08:10.687: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.687: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.688: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.692: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.697: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.765: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.815: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.821: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.828: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.832: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.832: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.837: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.842: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.861: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.867: D/StNfcHal(1024): (#007DF) Rx 6f 02 0a (hidden)
01-15 16:08:10.880: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.899: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.916: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.917: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.926: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.933: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.938: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.943: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.944: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.950: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.951: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.957: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.958: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.963: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.963: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.968: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.969: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.974: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.974: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.979: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.979: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.984: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.984: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.989: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.990: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.994: D/XNNPACK(12852): allocated 16416 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:11.012: I/NeuralNetworkService(12852): - neuralNetworkloadAndRunPytorch - Abut to run inference ---
01-15 16:08:11.012: I/ETLOG(12852): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 11406
01-15 16:08:11.013: I/ETLOG(12852): Error setting input 0: 0x10
01-15 16:08:11.013: I/ETLOG(12852): In function forward(), assert failed: set_input_status == Error::Ok
01-15 16:08:11.013: A/libc(12852): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 12870 (Thread-2), pid 12852 (lNetworkService)
01-15 16:08:11.088: I/crash_dump64(13916): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
01-15 16:08:11.089: I/tombstoned(719): received crash request for pid 12870
01-15 16:08:11.089: I/crash_dump64(13916): performing dump of process 12852 (target tid = 12870)
01-15 16:08:11.390: A/DEBUG(13916): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-15 16:08:11.390: A/DEBUG(13916): Build fingerprint: 'Fairphone/FP4eea/FP4:13/TKQ1.230127.002/TP20:user/release-keys'
01-15 16:08:11.390: A/DEBUG(13916): Revision: '0'
01-15 16:08:11.390: A/DEBUG(13916): ABI: 'arm64'
01-15 16:08:11.390: A/DEBUG(13916): Timestamp: 2024-01-15 16:08:11.106599870+0100
01-15 16:08:11.390: A/DEBUG(13916): Process uptime: 381s
01-15 16:08:11.390: A/DEBUG(13916): Cmdline: com.android.contextq:ContextQNeuralNetworkService
01-15 16:08:11.390: A/DEBUG(13916): pid: 12852, tid: 12870, name: Thread-2 >>> com.android.contextq:ContextQNeuralNetworkService <<<
01-15 16:08:11.390: A/DEBUG(13916): uid: 10207
01-15 16:08:11.390: A/DEBUG(13916): signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
01-15 16:08:11.390: A/DEBUG(13916): x0 0000000000000000 x1 0000000000003246 x2 0000000000000006 x3 00000072fe287f50
01-15 16:08:11.390: I/LocationChangeManagement(5959): - lastKnownLocation - Last known location - lastKnownLocationString - Location[Provider= network, lat= 45.753405, lon= 8.312165, acc= 12, t= 1705331289499, et= 3625844163303, alt= 722.2000122070312, vel= -1.00, bear= -1.00, {Bundle[{networkLocationType=wifi}]}]
01-15 16:08:11.390: A/DEBUG(13916): x4 60651f7371647272 x5 60651f7371647272 x6 60651f7371647272 x7 7f7f7f7f7f7f7f7f
01-15 16:08:11.390: A/DEBUG(13916): x8 00000000000000f0 x9 0000007681304b28 x10 0000000000000001 x11 000000768134484c
01-15 16:08:11.390: A/DEBUG(13916): x12 00000072fe286520 x13 0000000000000044 x14 00000072fe287868 x15 0000000034155555
01-15 16:08:11.390: A/DEBUG(13916): x16 00000076813acd68 x17 00000076813884e0 x18 00000072fd7a2000 x19 0000000000003234
01-15 16:08:11.390: A/DEBUG(13916): x20 0000000000003246 x21 00000000ffffffff x22 000000769380e9d8 x23 000000769380e9d8
01-15 16:08:11.390: A/DEBUG(13916): x24 00000072fe2885f0 x25 b4000074c9870870 x26 0000000000000002 x27 0000007693abc378
01-15 16:08:11.390: A/DEBUG(13916): x28 00000072fe2884c0 x29 00000072fe287fd0
01-15 16:08:11.390: A/DEBUG(13916): lr 0000007681335788 sp 00000072fe287f30 pc 00000076813357b4 pst 0000000000001000
01-15 16:08:11.390: A/DEBUG(13916): backtrace:
01-15 16:08:11.390: A/DEBUG(13916): #00 pc 00000000000527b4 /apex/com.android.runtime/lib64/bionic/libc.so (abort+168) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)
01-15 16:08:11.390: A/DEBUG(13916): #01 pc 0000000000b95590 /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (et_pal_abort+8) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916): #02 pc 0000000000b95398 /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (torch::executor::runtime_abort()+8) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916): #03 pc 0000000000b72dac /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)+596) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916): #04 pc 0000000000b73384 /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::MethodWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (executorch_jni::ExecuTorchJni::*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>), &(executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)), executorch_jni::ExecuTorchJni, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::dispatch(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&)+236) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916): #05 pc 0000000000b7ba64 /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::CallWithJniConversions<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&), facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, 
facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&))+96) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916): #06 pc 0000000000b731b4 /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::FunctionWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&), facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(_JNIEnv*, _jobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, 
facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&))+64) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916): #07 pc 0000000000b6a754 /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::MethodWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (executorch_jni::ExecuTorchJni::*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>), &(executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)), executorch_jni::ExecuTorchJni, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(_JNIEnv*, _jobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*)+44) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916): #08 pc 0000000000355830 /apex/com.android.art/lib64/libart.so (art_quick_generic_jni_trampoline+144) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.390: A/DEBUG(13916): #09 pc 000000000033eda4 /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.390: A/DEBUG(13916): #10 pc 0000000000511050 /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+1976) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #11 pc 0000000000498288 /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+4716) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #12 pc 0000000000357fd8 /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #13 pc 0000000000a29dd8 /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/oat/arm64/base.vdex (com.example.executorchdemo.executor.Module.forward+0)
01-15 16:08:11.391: A/DEBUG(13916): #14 pc 0000000000374120 /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #15 pc 0000000000511d1c /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #16 pc 00000000004973dc /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+960) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #17 pc 0000000000357fd8 /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #18 pc 000000000000d4fc /data/data/com.android.contextq/code_cache/.overlay/base.apk/classes15.dex (com.android.contextq.neuralnetwork.NeuralNetworkService.neuralNetworkloadAndRunPytorch+0)
01-15 16:08:11.391: A/DEBUG(13916): #19 pc 0000000000374120 /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #20 pc 0000000000511d1c /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #21 pc 000000000049774c /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+1840) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #22 pc 0000000000357fd8 /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #23 pc 0000000000007d44 /data/data/com.android.contextq/code_cache/.overlay/base.apk/classes15.dex (com.android.contextq.neuralnetwork.NeuralNetworkService$NeuralNetworkServiceRunnable.run+0)
01-15 16:08:11.391: A/DEBUG(13916): #24 pc 0000000000374120 /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #25 pc 0000000000511d1c /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #26 pc 0000000000498288 /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+4716) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #27 pc 0000000000357fd8 /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #28 pc 000000000000308c [anon:dalvik-/apex/com.android.art/javalib/core-oj.jar-transformed] (java.lang.Thread.run+0)
01-15 16:08:11.391: A/DEBUG(13916): #29 pc 0000000000374120 /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #30 pc 0000000000373a18 /apex/com.android.art/lib64/libart.so (artQuickToInterpreterBridge+964) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #31 pc 0000000000355968 /apex/com.android.art/lib64/libart.so (art_quick_to_interpreter_bridge+88) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #32 pc 000000000033eda4 /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #33 pc 0000000000239d54 /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+144) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #34 pc 000000000053a1b0 /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+1600) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916): #35 pc 00000000000ba650 /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)
01-15 16:08:11.391: A/DEBUG(13916): #36 pc 0000000000053ffc /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)
@SS-JIA, in your comment above you state: "XNNPACK doesn't support dynamic shapes." Yet @cccclai, in his comment above, states: "In PyTorch Mobile, XNNPACK is pretty much like a default backend and it runs after we call optimized_for_mobile."
My model does use dynamic shapes, and I was able to run it for inference successfully from my Android application using the PyTorch Mobile (skipping the optimization step) runtime engine.
If I was able to run my model successfully using PyTorch Mobile because I skipped the optimization step, then why is there no way to skip optimization when using ExecuTorch? That would seem to be a reasonable option to have.
As far as I know, models with dynamic shapes are not the exception. When, and how, will it be possible to run models with dynamic shapes on Android devices using the ExecuTorch runtime engine?
If the answer to both questions above is negative, then it looks like I will not be able to use ExecuTorch for my models. That would be really too bad.
Please let me know if I misunderstood your comment and if I am missing something.
Thanks
@SS-JIA would it help if I sent you the .pte file produced when training my model using ExecuTorch?
Also, here is a link to the model I use: https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master. I modified it slightly to work with my dataset, and added ExecuTorch code to train.py, which I use to produce the model. For the time being, I commented out the validation code. If you like, I can send you the modified train.py I used to produce the .pte file.
I hope you will have the time to let me know how I should proceed. Thanks
@mcr229 can you take a look and see the dynamic shape support issue in xnnpack
Hi @adonnini, the XNNPACK delegate currently only supports inputs with static shapes. We are actively working on upstreaming dynamic shape support to XNNPACK, and once that is finished we will be able to leverage it by updating our XNNPACK commit.
Thanks for the update. As far as you can tell at the moment, is it a matter of weeks before you will update the XNNPACK commit? Just so that I can plan accordingly. Thanks
We expect to have this ready within the next two weeks.
Thanks!
@mcr229 I am still running into this issue with the latest ExecuTorch release. Was the addition of dynamic shape support to XNNPACK completed and released? Thanks
@mcr229 @kimishpatel I just received this response from @alankelly regarding XNNPACK dynamic shape support: https://github.com/google/XNNPACK/issues/6423. My first question is: how can I check that I am using a release of XNNPACK with dynamic shape support? @alankelly seems to think that the problem I have is with ExecuTorch. He may have a point, since the error log (see below) does not mention XNNPACK, whereas the error in this issue does (see above). As I mentioned previously, I was able to run this model for inference from my Android app after lowering it with PyTorch Mobile (using TorchScript).
What should I do next? Please let me know if you need any information.
Thanks
ERROR LOG
05-12 16:50:23.542: E/ExecuTorch(12402): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 14415
05-12 16:50:23.542: E/ExecuTorch(12402): Error setting input 0: 0x10
05-12 16:50:23.542: A/ExecuTorch(12402): In function execute_method(), assert failed (result.ok()): Execution of method forward failed with status 0x12
05-12 16:50:23.542: A/libc(12402): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 12434 (Thread-2), pid 12402 (lNetworkService)
05-12 16:50:23.597: I/crash_dump64(12837): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
05-12 16:50:23.598: I/tombstoned(712): received crash request for pid 12434
05-12 16:50:23.603: I/crash_dump64(12837): performing dump of process 12402 (target tid = 12434)
05-12 16:50:23.728: W/adbd(9671): timeout expired while flushing socket, closing
05-12 16:50:23.829: A/DEBUG(12837): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
05-12 16:50:23.829: A/DEBUG(12837): Build fingerprint: 'Fairphone/FP4eea/FP4:13/TKQ1.230127.002/TP2D:user/release-keys'
05-12 16:50:23.829: A/DEBUG(12837): Revision: '0'
05-12 16:50:23.829: A/DEBUG(12837): ABI: 'arm64'
05-12 16:50:23.829: A/DEBUG(12837): Timestamp: 2024-05-12 16:50:23.608361388+0200
05-12 16:50:23.829: A/DEBUG(12837): Process uptime: 377s
05-12 16:50:23.829: A/DEBUG(12837): Cmdline: com.android.contextq:ContextQNeuralNetworkService
05-12 16:50:23.829: A/DEBUG(12837): pid: 12402, tid: 12434, name: Thread-2 >>> com.android.contextq:ContextQNeuralNetworkService <<<
05-12 16:50:23.829: A/DEBUG(12837): uid: 10207
05-12 16:50:23.829: A/DEBUG(12837): signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
05-12 16:50:23.829: A/DEBUG(12837): Abort message: 'In function execute_method(), assert failed (result.ok()): Execution of method forward failed with status 0x12'
05-12 16:50:23.829: A/DEBUG(12837): x0 0000000000000000 x1 0000000000003092 x2 0000000000000006 x3 0000007970a42e30
05-12 16:50:23.829: A/DEBUG(12837): x4 72601f2b2827636e x5 72601f2b2827636e x6 72601f2b2827636e x7 7f7f7f7f7f7f7f7f
05-12 16:50:23.829: A/DEBUG(12837): x8 00000000000000f0 x9 0000007d0a45ab28 x10 0000000000000001 x11 0000007d0a49a84c
05-12 16:50:23.829: A/DEBUG(12837): x12 0000007970a41400 x13 000000000000006f x14 0000007970a42748 x15 0000000034155555
05-12 16:50:23.829: A/DEBUG(12837): x16 0000007d0a502d68 x17 0000007d0a4de4e0 x18 000000796fac0000 x19 0000000000003072
05-12 16:50:23.829: A/DEBUG(12837): x20 0000000000003092 x21 00000000ffffffff x22 0000007cfd41da00 x23 0000007cfd41da00
05-12 16:50:23.829: A/DEBUG(12837): x24 0000007970a435b0 x25 b400007b3371b560 x26 0000000000002072 x27 0000007cfd9143e8
05-12 16:50:23.829: A/DEBUG(12837): x28 0000007970a43480 x29 0000007970a42eb0
05-12 16:50:23.829: A/DEBUG(12837): lr 0000007d0a48b788 sp 0000007970a42e10 pc 0000007d0a48b7b4 pst 0000000000001000
Please let me know what I should do next in order to resolve this issue, and whether you need any information. Thanks
@mcr229 can you take a look?
A quick update. As a backup/temporary solution while working on resolving the issues with ExecuTorch, I used TorchScript to produce a lowered model for the PyTorch Mobile runtime. The process worked: I was able to load the lowered model successfully. To a certain extent, this test seems to indicate that the current issues are not related to the model itself. This is not a solution; my goal is to use ExecuTorch. I really hope we can make some progress on resolving the current issues soon. Thanks
The error indeed is not coming from XNNPACK; it seems to be thrown within ExecuTorch. "Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 14415" suggests that a dynamic tensor has been marked as static. Was the model exported with dynamic shapes? I saw that you're using this model: https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master
Do you also happen to have a code pointer to how you're exporting the model?
cc. @JacobSzwejbka @cccclai
@mcr229 Thanks for taking a look. Below, you will find the code I use to export the model. Is this what you are looking for? Please let me know if you need any other information. Thanks
CODE
pre_autograd_aten_dialect = capture_pre_autograd_graph(m, (enc_input, dec_input, dec_source_mask, dec_target_mask))
aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect, (enc_input, dec_input, dec_source_mask, dec_target_mask), strict=False)
edge_program: EdgeProgramManager = to_edge(aten_dialect)
to_be_lowered_module = edge_program.exported_program()
from executorch.exir.backend.backend_api import LoweredBackendModule, to_backend
lowered_module = edge_program.to_backend(XnnpackPartitioner())
save_path = "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/models/tpt_delegate.pte"
with open(save_path, "wb") as f:
f.write(lowered_module.to_executorch().buffer)
Hi @adonnini, when enabling dynamic shapes for the ExecuTorch model, you can specify the dynamic shapes when capturing the graph. Here is an example:
from torch.export import Dim
class Basic(torch.nn.Module):
def __init__(self):
super().__init__()
def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
return x + y
f = Basic()
example_args = (torch.randn(3, 3), torch.randn(3, 3))
dim1_x = Dim("dim1_x", min=1, max=10)
dynamic_shapes = {"x": {1: dim1_x}, "y": {1: dim1_x}}
pre_autograd_aten_dialect = capture_pre_autograd_graph(
f, example_args, dynamic_shapes=dynamic_shapes
)
aten_dialect: ExportedProgram = export(f, example_args, dynamic_shapes=dynamic_shapes)
print("ATen Dialect Graph")
print(aten_dialect)
Afterwards you can follow the same flow of to_edge --> to_backend --> f.write().
@mcr229 Thanks! I'll do as you suggest. The one thing that gives me pause is the underlying assumption that I would know the maximum dimension a priori. As a (temporary) workaround I can make an educated guess; however, I am concerned that this would not work in a production environment. Is the idea that I should give max a value high enough to cover the vast majority of cases? For example, could/should I give max a value of 1000000? Thanks
@adonnini I believe ExecuTorch does upper-bounded memory planning, and I know that the XNNPACK delegate does as well. I'm not entirely sure how ExecuTorch handles very large max values with respect to memory planning. The XNNPACK-delegated portions will use the upper bound to do their initial memory planning. XNNPACK can actually go above the maximum value, but at the cost of some performance, since we reallocate memory for the new larger amount at that inference. My fear with too large a maximum value is that XNNPACK may fail with out-of-memory errors as it tries to allocate extremely large intermediate tensors. So I would suggest using the most realistic maximum tensor size.
cc. @JacobSzwejbka, @cccclai, @larryliu0820 for the dynamic memory planning
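To make the cost of an oversized upper bound concrete, here is a back-of-the-envelope sketch. The feature and hidden sizes are assumed values for illustration (not taken from the actual model); the point is that upper-bounded planning reserves worst-case memory up front, so the reservation scales linearly with max.

```python
# Rough illustration (hypothetical dims): memory reserved for one float32
# intermediate tensor of shape (max_len, features, hidden) when planning
# against the upper bound of a dynamic dimension.
BYTES_PER_FLOAT32 = 4
features, hidden = 4, 128  # assumed values, not from the actual model

def worst_case_bytes(max_len: int) -> int:
    # Worst-case size of a single (max_len, features, hidden) float32 tensor.
    return max_len * features * hidden * BYTES_PER_FLOAT32

for max_len in (512, 100_000, 1_000_000):
    print(f"max={max_len:>9}: ~{worst_case_bytes(max_len) / 2**20:.1f} MiB")
```

Under these assumed sizes, max=1000000 reserves roughly 2 GB for a single such tensor, which is why a realistic bound matters on a mobile device.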
@mcr229 after adding the dynamic shapes code, execution failed, producing the traceback log reported below. For your reference, you will also find below the code that produced the failure.
The ExecuTorch-related code is inserted in the training epoch loop. It runs after a training step which, based on the logs, completed successfully. I am pointing this out because I find this line in the traceback log puzzling (though not surprising):
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['enc_input'].size()[1] = 7 is not equal to L['dec_input'].size()[1] = 12
enc_input and dec_input in the model can be equal, but are not expected to be.
Here is a print of their shapes:
- train_minimum - Lowering the Whole Module - enc_input.shape - torch.Size([27, 7, 2])
- train_minimum - Lowering the Whole Module - dec_input.shape - torch.Size([27, 12, 3])
- train_minimum - Lowering the Whole Module - dec_source_mask.shape - torch.Size([27, 1, 7])
- train_minimum - Lowering the Whole Module - dec_target_mask.shape - torch.Size([27, 12, 12])
Probably, I just don't understand the error statement.
Please let me know what I should do next, and if you need any additional information.
Thanks
TRACEBACK LOG
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Error while creating guard:
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Name: ''
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Source: shape_env
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Create Function: SHAPE_ENV
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Guard Types: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Code List: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Object Weakref: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Guarded Class Weakref: None
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] Created at:
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 482, in transform
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] tracer = InstructionTranslator(
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2060, in __init__
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] output=OutputGraph(
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 310, in __init__
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] self.init_ambient_guards()
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 421, in init_ambient_guards
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] self.guards.add(ShapeEnvSource().make_guard(GuardBuilder.SHAPE_ENV))
0%| | 0/5 [00:25<?, ?it/s]
Traceback (most recent call last):
File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train-minimum.py", line 438, in <module>
pre_autograd_aten_dialect = capture_pre_autograd_graph(m,
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/__init__.py", line 151, in capture_pre_autograd_graph
m = torch._dynamo.export(
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1354, in inner
raise constraint_violation_error
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1311, in inner
result_traced = opt_f(*args, **kwargs)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
return fn(*args, **kwargs)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 921, in catch_errors
return callback(frame, cache_entry, hooks, frame_state, skip=1)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 400, in _convert_frame_assert
return _compile(
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 676, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
r = func(*args, **kwargs)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 634, in compile_inner
check_fn = CheckFunctionManager(
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/guards.py", line 1048, in __init__
guard.create(builder)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_guards.py", line 249, in create
return self.create_fn(builder, self)
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/guards.py", line 705, in SHAPE_ENV
guards = output_graph.shape_env.produce_guards(
File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 2946, in produce_guards
raise ConstraintViolationError(
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['enc_input'].size()[1] = 7 is not equal to L['dec_input'].size()[1] = 12
CODE
dim1_x = Dim("dim1_x", min=1, max=100000)
dynamic_shapes = {"enc_input": {1: dim1_x}, "dec_input": {1: dim1_x}, "dec_source_mask": {1: dim1_x}, "dec_target_mask": {1: dim1_x}}
pre_autograd_aten_dialect = capture_pre_autograd_graph(
    m, (enc_input, dec_input, dec_source_mask, dec_target_mask), dynamic_shapes=dynamic_shapes)
aten_dialect: ExportedProgram = export(
    pre_autograd_aten_dialect,
    (enc_input, dec_input, dec_source_mask, dec_target_mask), dynamic_shapes=dynamic_shapes)
print(" - train_minimum - Lowering the Whole Module - ATen Dialect Graph")
print(" - train_minimum - Lowering the Whole Module - aten_dialect - ", aten_dialect)
edge_program: EdgeProgramManager = to_edge(aten_dialect)
to_be_lowered_module = edge_program.exported_program()
from executorch.exir.backend.backend_api import LoweredBackendModule, to_backend
lowered_module = edge_program.to_backend(XnnpackPartitioner())
print(" - train_minimum - Lowering the Whole Module - lowered_module - ", lowered_module)
save_path = "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/models/tpt_delegate.pte"
with open(save_path, "wb") as f:
    f.write(lowered_module.to_executorch().buffer)
@adonnini the statement means that a guard was generated during export which checks that L['enc_input'].size()[1] == L['dec_input'].size()[1]. Within the dynamic range you provided, this constraint is violated when L['enc_input'].size()[1] is 7 and L['dec_input'].size()[1] is 12. You can enable detailed logging to see which line of model source code generated this guard, so that we can potentially change either the code or the constraint range. To do this, add the following to the top of the export script:
import os
os.environ["TORCH_LOGS"] = "+dynamo"
torch._logging._init_logs()
In the logs search for "guard added" and you should be able to see which line of model source code generated this guard.
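For illustration only (a toy model, not the one from this issue), such a guard is typically recorded when some op in the model genuinely requires the two lengths to match, e.g. an element-wise add between the encoder and decoder tensors:

```python
import torch

class Toy(torch.nn.Module):
    def forward(self, enc_input, dec_input):
        # Element-wise add requires enc_input.size(1) == dec_input.size(1)
        # (neither is 1, so broadcasting cannot reconcile them); tracing
        # this op is what records a guard equating the two dimensions.
        return enc_input + dec_input

m = Toy()
m(torch.randn(1, 7, 4), torch.randn(1, 7, 4))  # ok: lengths match

# Mismatched lengths fail even in eager mode, which is why export
# refuses to treat the two dims as independent.
try:
    m(torch.randn(1, 7, 4), torch.randn(1, 12, 4))
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
```

Finding the model line that produces such an op is exactly what the "guard added" log entries point to.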
@tarun292 I will do as you ask and let you know what I find. However, I am puzzled by this statement:
guard was generated during export that checks to ensure that L['enc_input'].size()[1] == L['dec_input'].size()[1]
I don't understand why this check would be generated in the first place since, unless I am mistaken, there is no requirement that L['enc_input'].size()[1] and L['dec_input'].size()[1] be equal. Where does that requirement come from?
What am I missing / doing wrong?
Thanks
@adonnini before or after this log line there should be another print indicating which source line generated this guard. Are you able to see it?