
Android app - Error - Attempted to resize a static tensor to a new shape at dimension 0

Open adonnini opened this issue 2 years ago • 47 comments

My Android application fails with an Attempted to resize a static tensor to a new shape at dimension 0 error. Please find the full logcat below.

The shape of the input datasets for my model is not static; specifically, the number of steps in any one sequence varies.

Here is the code I use to define the input dataset for the model in the Android application:

    float[] flat = flatten(tmpData);
    final long[] shapeArrDataPytorchFlattened = new long[]{tmpData.length, 4, 1};
    arrDataPytorch = Tensor.fromBlob(flat, shapeArrDataPytorchFlattened);

where 4 is the number of features and tmpData.length is the number of rows in the input dataset (which has n rows and 4 columns).

Here is the code I use to run inference:

    try {
        Log.i(TAG, " - neuralNetworkloadAndRunPytorch - Abut to run inference --- ");
        outputTensor = mModule.forward(IValue.from(arrDataPytorch)).toTensor();
    } catch (Exception e) {
        Log.i(TAG, " - neuralNetworkloadAndRunPytorch - Inference FAILED --- ");
        throw new RuntimeException(e);
    }

When I run inference on the same model exported with TorchScript and run with PyTorch Mobile, I produce the input dataset as follows:

    final long[] shapeArrDataPytorchFlattened = new long[]{1, flat.length};   // USED FOR PYTORCH MOBILE
    arrDataPytorch = Tensor.fromBlob(flat, shapeArrDataPytorchFlattened);

and run inference as follows:

    mModule = LiteModuleLoader.load(moduleFileAbsoluteFilePath);
    outputTensor = mModule.forward(IValue.from(arrDataPytorch)).toTensor();

This works and produces reasonable results.

I would appreciate any thoughts as to what is causing the problem, and how I might go about fixing it.

Thanks

LOGCAT

12-05 16:48:49.983: I/NeuralNetworkService(16887):  - NeuralNetworkServiceRunnable - neuralNetworkInputPreparationRunning - 1 - 0
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - NeuralNetworkServiceRunnable - neuralNetworkLoadAndRunRunning - 0 - 0
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - NeuralNetworkServiceRunnable - About to run neuralNetworkloadAndRun --- 
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - Running - 
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - locationInformationDir - /data/user/0/com.android.contextq/files/locationInformation/
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - savedNetworkArchiveLength - 120669888
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - Abut to load module --- 
12-05 16:48:50.067: I/ETLOG(16887): Model file /data/user/0/com.android.contextq/files/locationInformation/tfmodel_exnnpack.pte is loaded.
12-05 16:48:50.067: I/ETLOG(16887): Setting up planned buffer 0, size 23366800.
12-05 16:48:50.077: W/libc(16887): Access denied finding property "ro.hardware.chipname"
12-05 16:48:50.078: W/adbd(13666): timeout expired while flushing socket, closing
12-05 16:48:50.080: D/XNNPACK(16887): allocated 6144 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.080: D/XNNPACK(16887): created workspace of size 774176
12-05 16:48:50.081: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.085: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.088: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.092: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.092: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.092: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.092: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.096: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.097: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.113: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.127: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.130: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.130: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.130: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.130: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.132: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.146: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.150: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.150: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.150: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.150: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.152: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.166: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.170: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.170: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.170: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.170: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.172: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.186: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.190: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.190: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.190: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.190: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.192: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.206: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.209: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.213: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.217: D/XNNPACK(16887): allocated 8192 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.217: D/XNNPACK(16887): created workspace of size 1327136
12-05 16:48:50.217: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.221: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.224: D/XNNPACK(16887): reusing tensor id #8 memory for tensor id #5 Node #2 Softmax
12-05 16:48:50.224: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.225: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.225: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.229: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.232: I/XNNPACK(16887): fuse Clamp Node #2 into upstream Node #1
12-05 16:48:50.234: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.249: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.263: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.269: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.273: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.276: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.277: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.281: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.284: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.286: D/StNfcHal(979): (#0C838) Rx 60 07 01 e2 
12-05 16:48:50.286: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.301: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.315: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.319: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.322: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.323: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.327: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.331: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.334: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.338: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.338: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.342: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.344: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.357: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.358: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.361: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.362: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.365: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.367: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.381: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.381: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.385: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.385: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.389: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.390: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.404: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.405: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.408: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.409: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.412: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.414: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.427: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.428: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.431: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.432: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.435: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.437: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.450: D/XNNPACK(16887): allocated 16416 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.467: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - Abut to run inference --- 
12-05 16:48:50.467: I/ETLOG(16887): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 12716
12-05 16:48:50.467: I/ETLOG(16887): Error setting input 0: 0x10
12-05 16:48:50.467: I/ETLOG(16887): In function forward(), assert failed: set_input_status == Error::Ok
12-05 16:48:50.467: A/libc(16887): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 16905 (Thread-2), pid 16887 (lNetworkService)
12-05 16:48:50.635: I/crash_dump64(17226): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto

adonnini avatar Dec 05 '23 16:12 adonnini

The difference between the Lite Interpreter (PyTorch Mobile) and ExecuTorch is that ExecuTorch plans memory ahead of time, which helps re-use and reduce memory usage at runtime. What is the dynamic part of the original PyTorch model? Can the dynamic part be upper-bounded?

Reference doc: https://pytorch.org/executorch/stable/compiler-memory-planning.html

cccclai avatar Dec 06 '23 18:12 cccclai

@cccclai Thanks for the response. Please bear with me as I am a beginner with PyTorch (my background is Java and Android application development).

I did not create the model I am using (https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master). I modified it slightly to accommodate my datasets (produced by my Android application).

When you ask about the dynamic part of the model, could you please clarify? Dynamic with respect to which variables/parts of the model?

The model is an attention-based Transformer network. Depending on what you mean by dynamic: since it is an encoder/decoder model, the input from one part of the model to another is dynamic.

I read the document you referenced in your message. Would it make sense for me to use the alloc_graph_input=False option? If I did, it's not clear what my application would need to do to satisfy this: "If the IO is not planned then users will be expected to provide data buffers to back these values at runtime". Did I misunderstand the documentation?

Thanks

adonnini avatar Dec 06 '23 20:12 adonnini

If the IO is not planned then users will be expected to provide data buffers to back these values at runtime

What this means is that the inputs you provide will either a) be copied into the memory reserved for them during the memory planning pass, if IO was part of memory planning, or b) not be copied, because memory planning did not reserve space for them.

By default IO is planned, and hence if you follow this, https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/jni/jni_layer.cpp#L345, you will see that the output returned from the executor is referenced directly, and there is a comment on the lifetime of the pointer referenced by the output tensor.
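For reference, here is a rough sketch of how IO planning can be turned off when serializing. This assumes the MemoryPlanningPass options described in the memory-planning doc linked above; exact import paths and defaults may differ across ExecuTorch versions.

from executorch.exir import ExecutorchBackendConfig
from executorch.exir.passes import MemoryPlanningPass

# Assumes `edge_program` is the EdgeProgramManager produced by to_edge(...).
# With alloc_graph_input=False the runtime no longer owns the input buffers,
# so the caller has to provide backing memory for them at runtime.
executorch_program = edge_program.to_executorch(
    ExecutorchBackendConfig(
        memory_planning_pass=MemoryPlanningPass(
            alloc_graph_input=False,
            alloc_graph_output=True,
        )
    )
)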

Now, with respect to dynamic size, there are a couple of things.

  1. Export + ExecuTorch support upper-bounded dynamic sizes. That is, you can tag the input tensors as having bounded dynamic size, e.g. for each dimension of the input, what is the maximum size that dimension can take. This helps export understand the maximum size expected for input, output, and intermediate tensors. You can read more here https://pytorch.org/docs/stable/export.html#expressing-dynamism and @JacobSzwejbka is an expert on this who can give more input.
  2. Do you know what part of your model is dynamic? You may not know this, and that's fine. In that case I presume you just expect that "I should be able to supply input of varying size", right?
  3. Shape dynamism gets a bit more complicated if the model you are interested in is also lowered to a delegate such as XNNPACK. In that case the delegate also needs to support dynamic shapes. For XNNPACK, this is not the case yet, so you won't quite be able to run a model via XNNPACK that takes inputs of different shapes.

kimishpatel avatar Dec 07 '23 15:12 kimishpatel

@kimishpatel Thanks!

a) It looks like I may have a problem, since XNNPACK does not support shape dynamism and the input sequences in the model I use are of varying length.

b) Your assumption is correct. I expected to be able to provide input of varying size.

c) Given what @cccclai said in his comment regarding the difference between PyTorch Mobile and ExecuTorch, and also based on your point 1. above, I will try the torch.export.dynamic_dim() API.

d) Why is delegation (e.g. via XNNPACK) necessary in order to lower a model onto an edge device using ExecuTorch? Sorry for bringing it up again. With PyTorch Mobile it was not necessary (unless I am mistaken). Is there an alternative to using XNNPACK?

Thanks

adonnini avatar Dec 07 '23 21:12 adonnini

Why is delegation (e.g. via XNNPACK) necessary in order to lower a model onto an edge device using executorch?

Delegation is for delegating part of, or the whole of, a model to a more powerful backend on the device. Different edge devices may have different backends. XNNPACK (https://github.com/google/XNNPACK) is one of the most powerful backends on CPU. On iOS, for example, there are other powerful backends (https://github.com/pytorch/executorch/tree/main/backends/apple) like Core ML and MPS, and Qualcomm chipsets may have their own as well.

In PyTorch Mobile, XNNPACK is pretty much the default backend, and it is used after we call optimize_for_mobile. PyTorch Mobile also has a limited set of other backends, such as Core ML (https://pytorch.org/tutorials/prototype/ios_coreml_workflow.html).
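For context, the PyTorch Mobile flow being described looks roughly like this (a minimal sketch; MyModel is a placeholder, not the model from this issue):

import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

class MyModel(torch.nn.Module):  # placeholder stand-in for a real model
    def forward(self, x):
        return torch.relu(x)

scripted = torch.jit.script(MyModel().eval())
# optimize_for_mobile applies a fixed set of TorchScript passes and routes
# eligible ops (e.g. linear/conv) through XNNPACK-backed prepacked kernels.
optimized = optimize_for_mobile(scripted)
optimized._save_for_lite_interpreter("model.ptl")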

cccclai avatar Dec 11 '23 19:12 cccclai

@cccclai Thanks. When I used PyTorch Mobile I chose not to optimize before lowering the model to the edge device; PyTorch Mobile gave me that option, if I am not mistaken. ExecuTorch does not give you the option to skip optimization, right? By the way, I chose not to optimize in order to first get the basic mechanism working (i.e., performing inference on the edge device successfully).

adonnini avatar Dec 11 '23 20:12 adonnini

@adonnini to provide some context, in PyTorch mobile, the optimize_for_mobile step essentially applies a pre-defined set of transformation passes on the TorchScript model to optimize it for a specific processor (CPU, GPU, etc.).

With ExecuTorch, this process is a bit more involved. Essentially a model is represented by default using the Edge IR. However, since XNNPACK is a powerful library, we provide a delegate which will consume the Edge IR and convert the model to XNNPACK's representation. The converted graph can then be executed using XNNPACK.

Essentially, ExecuTorch provides more control over how your model executes.

Regarding your initial issue, would you mind sharing how you produced your model? As mentioned before, XNNPACK doesn't support dynamic shapes.

SS-JIA avatar Dec 15 '23 00:12 SS-JIA

@SS-JIA Here is a link to the model I use: https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master I modified it slightly to work with my dataset. I also added ExecuTorch code to train.py, which I use to produce the model. For the time being, I commented out the validation code. Is this the information you were looking for?

adonnini avatar Dec 15 '23 06:12 adonnini

Hi, after fixing the dynamic_dim error (https://github.com/pytorch/executorch/issues/1379) with @angelayi's greatly appreciated help, I tried once again to use the model for inference in my Android app.

Unfortunately the result was once again an Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 11406 error, as described earlier in this issue's thread. Below, you will find the latest traceback log.

Please let me know if you need me to do anything and what I should do next.

TRACEBACK LOG (IT'S LONG. I INCLUDED ALL MESSAGES RELATED TO THE FAILURE RATHER THAN ASSUMING WHAT IS RELEVANT)

01-15 16:08:10.386: I/ETLOG(12852): Model file /data/user/0/com.android.contextq/files/locationInformation/TptDelegate.pte is loaded.
01-15 16:08:10.386: I/ETLOG(12852): Setting up planned buffer 0, size 31460272.
01-15 16:08:10.400: W/libc(12852): Access denied finding property "ro.hardware.chipname"
01-15 16:08:10.422: D/XNNPACK(12852): allocated 6144 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.423: D/XNNPACK(12852): created workspace of size 774176
01-15 16:08:10.424: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.442: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.461: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.479: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.479: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.479: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.480: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.501: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.575: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.597: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.603: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.603: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.603: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.604: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.621: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.622: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.622: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.623: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.641: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.641: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.641: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.642: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.659: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.660: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.660: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.661: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.666: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.672: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.676: D/XNNPACK(12852): allocated 8192 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.676: D/XNNPACK(12852): created workspace of size 1327136
01-15 16:08:10.677: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.683: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.687: D/XNNPACK(12852): reusing tensor id #8 memory for tensor id #5 Node #2 Softmax
01-15 16:08:10.687: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.687: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.688: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.692: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.697: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.765: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.815: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.821: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.828: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.832: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.832: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.837: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.842: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.861: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.867: D/StNfcHal(1024): (#007DF) Rx 6f 02 0a (hidden)
01-15 16:08:10.880: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.899: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.916: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.917: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.926: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.933: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.938: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.943: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.944: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.950: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.951: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.957: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.958: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.963: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.963: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.968: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.969: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.974: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.974: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.979: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.979: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.984: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.984: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.989: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.990: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.994: D/XNNPACK(12852): allocated 16416 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:11.012: I/NeuralNetworkService(12852):  - neuralNetworkloadAndRunPytorch - Abut to run inference --- 
01-15 16:08:11.012: I/ETLOG(12852): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 11406
01-15 16:08:11.013: I/ETLOG(12852): Error setting input 0: 0x10
01-15 16:08:11.013: I/ETLOG(12852): In function forward(), assert failed: set_input_status == Error::Ok
01-15 16:08:11.013: A/libc(12852): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 12870 (Thread-2), pid 12852 (lNetworkService)
01-15 16:08:11.088: I/crash_dump64(13916): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
01-15 16:08:11.089: I/tombstoned(719): received crash request for pid 12870
01-15 16:08:11.089: I/crash_dump64(13916): performing dump of process 12852 (target tid = 12870)
01-15 16:08:11.390: A/DEBUG(13916): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-15 16:08:11.390: A/DEBUG(13916): Build fingerprint: 'Fairphone/FP4eea/FP4:13/TKQ1.230127.002/TP20:user/release-keys'
01-15 16:08:11.390: A/DEBUG(13916): Revision: '0'
01-15 16:08:11.390: A/DEBUG(13916): ABI: 'arm64'
01-15 16:08:11.390: A/DEBUG(13916): Timestamp: 2024-01-15 16:08:11.106599870+0100
01-15 16:08:11.390: A/DEBUG(13916): Process uptime: 381s
01-15 16:08:11.390: A/DEBUG(13916): Cmdline: com.android.contextq:ContextQNeuralNetworkService
01-15 16:08:11.390: A/DEBUG(13916): pid: 12852, tid: 12870, name: Thread-2  >>> com.android.contextq:ContextQNeuralNetworkService <<<
01-15 16:08:11.390: A/DEBUG(13916): uid: 10207
01-15 16:08:11.390: A/DEBUG(13916): signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
01-15 16:08:11.390: A/DEBUG(13916):     x0  0000000000000000  x1  0000000000003246  x2  0000000000000006  x3  00000072fe287f50
01-15 16:08:11.390: I/LocationChangeManagement(5959):  - lastKnownLocation - Last known location - lastKnownLocationString - Location[Provider= network, lat= 45.753405, lon= 8.312165, acc= 12,  t= 1705331289499, et= 3625844163303, alt= 722.2000122070312, vel= -1.00, bear= -1.00, {Bundle[{networkLocationType=wifi}]}]
01-15 16:08:11.390: A/DEBUG(13916):     x4  60651f7371647272  x5  60651f7371647272  x6  60651f7371647272  x7  7f7f7f7f7f7f7f7f
01-15 16:08:11.390: A/DEBUG(13916):     x8  00000000000000f0  x9  0000007681304b28  x10 0000000000000001  x11 000000768134484c
01-15 16:08:11.390: A/DEBUG(13916):     x12 00000072fe286520  x13 0000000000000044  x14 00000072fe287868  x15 0000000034155555
01-15 16:08:11.390: A/DEBUG(13916):     x16 00000076813acd68  x17 00000076813884e0  x18 00000072fd7a2000  x19 0000000000003234
01-15 16:08:11.390: A/DEBUG(13916):     x20 0000000000003246  x21 00000000ffffffff  x22 000000769380e9d8  x23 000000769380e9d8
01-15 16:08:11.390: A/DEBUG(13916):     x24 00000072fe2885f0  x25 b4000074c9870870  x26 0000000000000002  x27 0000007693abc378
01-15 16:08:11.390: A/DEBUG(13916):     x28 00000072fe2884c0  x29 00000072fe287fd0
01-15 16:08:11.390: A/DEBUG(13916):     lr  0000007681335788  sp  00000072fe287f30  pc  00000076813357b4  pst 0000000000001000
01-15 16:08:11.390: A/DEBUG(13916): backtrace:
01-15 16:08:11.390: A/DEBUG(13916):       #00 pc 00000000000527b4  /apex/com.android.runtime/lib64/bionic/libc.so (abort+168) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)
01-15 16:08:11.390: A/DEBUG(13916):       #01 pc 0000000000b95590  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (et_pal_abort+8) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #02 pc 0000000000b95398  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (torch::executor::runtime_abort()+8) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #03 pc 0000000000b72dac  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)+596) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #04 pc 0000000000b73384  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::MethodWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (executorch_jni::ExecuTorchJni::*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>), &(executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)), executorch_jni::ExecuTorchJni, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::dispatch(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&)+236) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #05 pc 0000000000b7ba64  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::CallWithJniConversions<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&), facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&))+96) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #06 pc 0000000000b731b4  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::FunctionWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&), facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(_JNIEnv*, _jobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&))+64) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #07 pc 0000000000b6a754  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::MethodWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (executorch_jni::ExecuTorchJni::*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>), &(executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)), executorch_jni::ExecuTorchJni, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(_JNIEnv*, _jobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*)+44) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #08 pc 0000000000355830  /apex/com.android.art/lib64/libart.so (art_quick_generic_jni_trampoline+144) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.390: A/DEBUG(13916):       #09 pc 000000000033eda4  /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.390: A/DEBUG(13916):       #10 pc 0000000000511050  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+1976) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #11 pc 0000000000498288  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+4716) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #12 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #13 pc 0000000000a29dd8  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/oat/arm64/base.vdex (com.example.executorchdemo.executor.Module.forward+0)
01-15 16:08:11.391: A/DEBUG(13916):       #14 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #15 pc 0000000000511d1c  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #16 pc 00000000004973dc  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+960) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #17 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #18 pc 000000000000d4fc  /data/data/com.android.contextq/code_cache/.overlay/base.apk/classes15.dex (com.android.contextq.neuralnetwork.NeuralNetworkService.neuralNetworkloadAndRunPytorch+0)
01-15 16:08:11.391: A/DEBUG(13916):       #19 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #20 pc 0000000000511d1c  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #21 pc 000000000049774c  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+1840) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #22 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #23 pc 0000000000007d44  /data/data/com.android.contextq/code_cache/.overlay/base.apk/classes15.dex (com.android.contextq.neuralnetwork.NeuralNetworkService$NeuralNetworkServiceRunnable.run+0)
01-15 16:08:11.391: A/DEBUG(13916):       #24 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #25 pc 0000000000511d1c  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #26 pc 0000000000498288  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+4716) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #27 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #28 pc 000000000000308c  [anon:dalvik-/apex/com.android.art/javalib/core-oj.jar-transformed] (java.lang.Thread.run+0)
01-15 16:08:11.391: A/DEBUG(13916):       #29 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #30 pc 0000000000373a18  /apex/com.android.art/lib64/libart.so (artQuickToInterpreterBridge+964) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #31 pc 0000000000355968  /apex/com.android.art/lib64/libart.so (art_quick_to_interpreter_bridge+88) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #32 pc 000000000033eda4  /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #33 pc 0000000000239d54  /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+144) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #34 pc 000000000053a1b0  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+1600) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #35 pc 00000000000ba650  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)
01-15 16:08:11.391: A/DEBUG(13916):       #36 pc 0000000000053ffc  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)

adonnini avatar Jan 15 '24 21:01 adonnini

@SS-JIA In your comment above you state "XNNPACK doesn't support dynamic shapes," yet @cccclai in his comment above states "In PyTorch Mobile, XNNPACK is pretty much the default backend, and it is used after we call optimize_for_mobile."

My model does use dynamic shapes, and I was able to run it for inference successfully from my Android application using the PyTorch Mobile runtime (skipping the optimization step).

If I was able to run my model successfully using PyTorch Mobile because I skipped the optimization step, then why is there no way to skip optimization when using ExecuTorch? This would seem to be a reasonable option to have.

As far as I know, models with dynamic shapes are not the exception. How (and when) will it be possible to run models with dynamic shapes on Android devices using the ExecuTorch runtime?

If the answer to both questions above is negative, then it looks like I will not be able to use ExecuTorch for my models. That would be really too bad.

Please let me know if I misunderstood your comment and if I am missing something.

Thanks

adonnini avatar Jan 16 '24 15:01 adonnini

@SS-JIA Would it help if I sent you the .pte file produced when training my model using ExecuTorch?

Also, here is a link to the model I use: https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master I modified it slightly to work with my dataset. I also added ExecuTorch code to train.py, which I use to produce the model. For the time being, I commented out the validation code. If you like, I can send you the modified train.py I used to produce the .pte file.

I hope you will have the time to let me know how I should proceed. Thanks

adonnini avatar Jan 18 '24 08:01 adonnini

@mcr229 can you take a look at the dynamic shape support issue in XNNPACK?

kimishpatel avatar Jan 29 '24 19:01 kimishpatel

Hi @adonnini, the XNNPACK delegate currently only supports inputs with static shapes. We are actively working on upstreaming dynamic shape support to XNNPACK, and once that is finished, we will be able to leverage it by updating our XNNPACK commit.

mcr229 avatar Feb 14 '24 17:02 mcr229

Thanks for the update. As far as you can tell at the moment, is it a matter of weeks before you will update the XNNPACK commit? Just so that I can plan accordingly. Thanks

adonnini avatar Feb 14 '24 17:02 adonnini

We expect to have this ready within the next two weeks.

mcr229 avatar Feb 14 '24 17:02 mcr229

Thanks!

adonnini avatar Feb 14 '24 17:02 adonnini

@mcr229 I am still running into this issue with the latest ExecuTorch release. Was the addition of dynamic shape support to XNNPACK completed and released? Thanks

adonnini avatar May 13 '24 16:05 adonnini

@mcr229 @kimishpatel I just received a response from @alankelly (https://github.com/google/XNNPACK/issues/6423) regarding XNNPACK dynamic shape support. My first question is: how can I check that I am using a release of XNNPACK with dynamic shape support? @alankelly seems to think that the problem I have is with ExecuTorch. He may have a point, since the error log (see below) does not mention XNNPACK, unlike the error earlier in this issue. As I mentioned previously, I was able to run this model for inference from my Android app by lowering it with PyTorch Mobile (using TorchScript).

What should I do next? Please let me know if you need any information.

Thanks

ERROR LOG

05-12 16:50:23.542: E/ExecuTorch(12402): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 14415
05-12 16:50:23.542: E/ExecuTorch(12402): Error setting input 0: 0x10
05-12 16:50:23.542: A/ExecuTorch(12402): In function execute_method(), assert failed (result.ok()): Execution of method forward failed with status 0x12
05-12 16:50:23.542: A/libc(12402): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 12434 (Thread-2), pid 12402 (lNetworkService)
05-12 16:50:23.597: I/crash_dump64(12837): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
05-12 16:50:23.598: I/tombstoned(712): received crash request for pid 12434
05-12 16:50:23.603: I/crash_dump64(12837): performing dump of process 12402 (target tid = 12434)
05-12 16:50:23.728: W/adbd(9671): timeout expired while flushing socket, closing
05-12 16:50:23.829: A/DEBUG(12837): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
05-12 16:50:23.829: A/DEBUG(12837): Build fingerprint: 'Fairphone/FP4eea/FP4:13/TKQ1.230127.002/TP2D:user/release-keys'
05-12 16:50:23.829: A/DEBUG(12837): Revision: '0'
05-12 16:50:23.829: A/DEBUG(12837): ABI: 'arm64'
05-12 16:50:23.829: A/DEBUG(12837): Timestamp: 2024-05-12 16:50:23.608361388+0200
05-12 16:50:23.829: A/DEBUG(12837): Process uptime: 377s
05-12 16:50:23.829: A/DEBUG(12837): Cmdline: com.android.contextq:ContextQNeuralNetworkService
05-12 16:50:23.829: A/DEBUG(12837): pid: 12402, tid: 12434, name: Thread-2  >>> com.android.contextq:ContextQNeuralNetworkService <<<
05-12 16:50:23.829: A/DEBUG(12837): uid: 10207
05-12 16:50:23.829: A/DEBUG(12837): signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
05-12 16:50:23.829: A/DEBUG(12837): Abort message: 'In function execute_method(), assert failed (result.ok()): Execution of method forward failed with status 0x12'
05-12 16:50:23.829: A/DEBUG(12837):     x0  0000000000000000  x1  0000000000003092  x2  0000000000000006  x3  0000007970a42e30
05-12 16:50:23.829: A/DEBUG(12837):     x4  72601f2b2827636e  x5  72601f2b2827636e  x6  72601f2b2827636e  x7  7f7f7f7f7f7f7f7f
05-12 16:50:23.829: A/DEBUG(12837):     x8  00000000000000f0  x9  0000007d0a45ab28  x10 0000000000000001  x11 0000007d0a49a84c
05-12 16:50:23.829: A/DEBUG(12837):     x12 0000007970a41400  x13 000000000000006f  x14 0000007970a42748  x15 0000000034155555
05-12 16:50:23.829: A/DEBUG(12837):     x16 0000007d0a502d68  x17 0000007d0a4de4e0  x18 000000796fac0000  x19 0000000000003072
05-12 16:50:23.829: A/DEBUG(12837):     x20 0000000000003092  x21 00000000ffffffff  x22 0000007cfd41da00  x23 0000007cfd41da00
05-12 16:50:23.829: A/DEBUG(12837):     x24 0000007970a435b0  x25 b400007b3371b560  x26 0000000000002072  x27 0000007cfd9143e8
05-12 16:50:23.829: A/DEBUG(12837):     x28 0000007970a43480  x29 0000007970a42eb0
05-12 16:50:23.829: A/DEBUG(12837):     lr  0000007d0a48b788  sp  0000007970a42e10  pc  0000007d0a48b7b4  pst 0000000000001000

adonnini avatar May 15 '24 15:05 adonnini

Please let me know what I should do next in order to resolve this issue, and if you need any information. Thanks

adonnini avatar May 20 '24 05:05 adonnini

@mcr229 can you take a look?

kimishpatel avatar May 20 '24 13:05 kimishpatel

A quick update. As a backup/temporary solution while working on resolving the issues with ExecuTorch, I used TorchScript to produce a lowered model for use with the PyTorch Mobile runtime. The process worked, and I was able to load the lowered model successfully. To a certain extent, this test seems to indicate that the current issues are not related to the model. This is not a solution; my goal is to use ExecuTorch. I really hope we can make some progress on resolving the current issues soon. Thanks

adonnini avatar May 20 '24 16:05 adonnini

The error indeed is not coming from XNNPACK; it seems to be thrown within ExecuTorch. Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 14415 seems to suggest that the dynamic tensor has been marked as static. Was the model exported with dynamic shapes? I saw that you're using this model: https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master

Do you also happen to have a code pointer to how you're exporting the model?

cc. @JacobSzwejbka @cccclai

mcr229 avatar May 23 '24 20:05 mcr229

@mcr229 Thanks for taking a look. Below, you will find the code I use to export the model. Is this what you are looking for? Please let me know if you need any other information. Thanks

CODE

# (Imports shown for completeness.)
from torch._export import capture_pre_autograd_graph
from torch.export import ExportedProgram, export
from executorch.exir import EdgeProgramManager, to_edge
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner

pre_autograd_aten_dialect = capture_pre_autograd_graph(m, (enc_input, dec_input, dec_source_mask, dec_target_mask))
aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect, (enc_input, dec_input, dec_source_mask, dec_target_mask), strict=False)
edge_program: EdgeProgramManager = to_edge(aten_dialect)

# Lower the partitioned parts of the graph to the XNNPACK delegate.
lowered_module = edge_program.to_backend(XnnpackPartitioner())

save_path = "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/models/tpt_delegate.pte"
with open(save_path, "wb") as f:
    f.write(lowered_module.to_executorch().buffer)

adonnini avatar May 24 '24 04:05 adonnini

Hi @adonnini, to enable dynamic shapes for the executorch model, you can specify them when capturing the graph. Here is an example:

import torch
from torch._export import capture_pre_autograd_graph
from torch.export import Dim, ExportedProgram, export


class Basic(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return x + y


f = Basic()
example_args = (torch.randn(3, 3), torch.randn(3, 3))
dim1_x = Dim("dim1_x", min=1, max=10)
dynamic_shapes = {"x": {1: dim1_x}, "y": {1: dim1_x}}
pre_autograd_aten_dialect = capture_pre_autograd_graph(
    f, example_args, dynamic_shapes=dynamic_shapes
)
aten_dialect: ExportedProgram = export(f, example_args, dynamic_shapes=dynamic_shapes)
print("ATen Dialect Graph")
print(aten_dialect)

Afterwards you can follow the same flow of to_edge --> to_backend --> f.write().
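
For completeness, a minimal sketch of that remaining flow, continuing the Basic example above (a sketch only; the import paths follow the executorch version discussed in this thread, and the output filename is just a placeholder):

# Sketch only: lower the exported program to XNNPACK and serialize it.
from executorch.exir import EdgeProgramManager, to_edge
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner

edge_program: EdgeProgramManager = to_edge(aten_dialect)
lowered = edge_program.to_backend(XnnpackPartitioner())

with open("basic_xnnpack.pte", "wb") as f:   # placeholder filename
    f.write(lowered.to_executorch().buffer)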

mcr229 avatar May 24 '24 16:05 mcr229

@mcr229 Thanks! I'll do as you suggest. The one thing that gives me pause is the underlying assumption that I would know the maximum dimension a priori. As a temporary workaround I can make an educated guess, but I am concerned that this would not work in a production environment. Is the idea to give max a value high enough to cover the vast majority of cases? For example, could/should I set max to 1000000? Thanks

adonnini avatar May 24 '24 20:05 adonnini

@adonnini I believe ExecuTorch does upper-bounded memory planning, and I know the XNNPACK delegate does as well. I'm not entirely sure how executorch handles very large max values with respect to memory planning, but the XNNPACK-delegated portions will use the upper bound for their initial memory planning. XNNPACK can actually go above the maximum value, though this comes at the cost of some performance, since we reallocate memory for the larger size at that inference. My concern with a very large maximum value is that XNNPACK may fail with out-of-memory errors as it tries to allocate extremely large intermediate tensors. So I would try to use the most realistic maximum tensor size.

cc. @JacobSzwejbka, @cccclai, @larryliu0820 for the dynamic memory planning
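
To illustrate the point about realistic bounds, a hypothetical sketch (the symbol name and the value 512 are assumptions, not recommendations for this particular model):

from torch.export import Dim

# Hypothetical: bound the dynamic dimension by a ceiling you realistically
# expect at inference time, rather than an arbitrarily large value like 1000000.
seq_len = Dim("seq_len", min=1, max=512)     # 512 is an assumed ceiling
dynamic_shapes = {"x": {1: seq_len}}         # "x" as in the Basic example above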

mcr229 avatar May 24 '24 20:05 mcr229

@mcr229 After adding the dynamic-shapes code, execution failed, producing the traceback log reported below. For your reference, you will also find the code that produced the failure below.

The executorch-related code is inserted in the training epoch loop. It runs after a training step which, based on the logs, completed successfully. I am pointing this out because I find this line in the traceback log puzzling (not surprising):

torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['enc_input'].size()[1] = 7 is not equal to L['dec_input'].size()[1] = 12

In the model, enc_input and dec_input can happen to have the same size at dimension 1, but they are not required to be equal.

Here is a print of their shapes:

 - train_minimum - Lowering the Whole Module - enc_input.shape -  torch.Size([27, 7, 2])
 - train_minimum - Lowering the Whole Module - dec_input.shape -  torch.Size([27, 12, 3])
 - train_minimum - Lowering the Whole Module - dec_source_mask.shape -  torch.Size([27, 1, 7])
 - train_minimum - Lowering the Whole Module - dec_target_mask.shape -  torch.Size([27, 12, 12])

Probably, I just don't understand the error statement.

Please let me know what I should do next, and if you need any additional information.

Thanks

TRACEBACK LOG

E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Error while creating guard:
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Name: ''
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Source: shape_env
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Create Function: SHAPE_ENV
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Guard Types: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Code List: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Object Weakref: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Guarded Class Weakref: None
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] Created at:
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 482, in transform
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     tracer = InstructionTranslator(
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2060, in __init__
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     output=OutputGraph(
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 310, in __init__
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     self.init_ambient_guards()
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 421, in init_ambient_guards
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     self.guards.add(ShapeEnvSource().make_guard(GuardBuilder.SHAPE_ENV))
  0%|                                                                                                                                              | 0/5 [00:25<?, ?it/s]
Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train-minimum.py", line 438, in <module>
    pre_autograd_aten_dialect = capture_pre_autograd_graph(m,
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/__init__.py", line 151, in capture_pre_autograd_graph
    m = torch._dynamo.export(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1354, in inner
    raise constraint_violation_error
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1311, in inner
    result_traced = opt_f(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 921, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 400, in _convert_frame_assert
    return _compile(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 676, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 634, in compile_inner
    check_fn = CheckFunctionManager(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/guards.py", line 1048, in __init__
    guard.create(builder)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_guards.py", line 249, in create
    return self.create_fn(builder, self)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/guards.py", line 705, in SHAPE_ENV
    guards = output_graph.shape_env.produce_guards(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 2946, in produce_guards
    raise ConstraintViolationError(
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['enc_input'].size()[1] = 7 is not equal to L['dec_input'].size()[1] = 12

CODE


        dim1_x = Dim("dim1_x", min=1, max=100000)
        dynamic_shapes = {"enc_input": {1: dim1_x}, "dec_input": {1: dim1_x}, "dec_source_mask": {1: dim1_x}, "dec_target_mask": {1: dim1_x}}

        pre_autograd_aten_dialect = capture_pre_autograd_graph(m,
                                                               (enc_input, dec_input, dec_source_mask, dec_target_mask), dynamic_shapes=dynamic_shapes)
        aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect,
                                               (enc_input, dec_input, dec_source_mask, dec_target_mask),
                                               dynamic_shapes=dynamic_shapes, strict=False)

        print(" - train_minimum - Lowering the Whole Module - ATen Dialect Graph")
        print(" - train_minimum - Lowering the Whole Module - aten_dialect - ", aten_dialect)

        edge_program: EdgeProgramManager = to_edge(aten_dialect)
        to_be_lowered_module = edge_program.exported_program()

        from executorch.exir.backend.backend_api import LoweredBackendModule, to_backend
        lowered_module = edge_program.to_backend(XnnpackPartitioner())

        print(" - train_minimum - Lowering the Whole Module - lowered_module - ", lowered_module)

        save_path = "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/models/tpt_delegate.pte"
        with open(save_path, "wb") as f:
            f.write(lowered_module.to_executorch().buffer)

adonnini avatar May 26 '24 06:05 adonnini

@adonnini The statement means that a guard was generated during export to check that L['enc_input'].size()[1] == L['dec_input'].size()[1]. Within the dynamic range you have provided, this constraint is violated when L['enc_input'].size()[1] is 7 and L['dec_input'].size()[1] is 12. You can enable detailed logging to see which line of model source code generated this guard, so that we can potentially change the code or the constraint range. To do this, add the following to the top of the export script:

import os
import torch

os.environ["TORCH_LOGS"] = "+dynamo"
torch._logging._init_logs()

In the logs, search for "guard added"; you should be able to see which line of model source code generated this guard.
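
As an aside, if the logs show that the equality comes from sharing the single dim1_x symbol across inputs rather than from the model code itself, one possible adjustment is to give independently varying dimensions their own symbols. This is a sketch only, with assumed bounds and with dimension indices chosen to match the shapes printed earlier in this thread (verify them against the model):

from torch.export import Dim

# Sketch: separate symbols for dimensions that are not required to be equal.
enc_len = Dim("enc_len", min=1, max=100)   # bounds are assumptions
dec_len = Dim("dec_len", min=1, max=100)
dynamic_shapes = {
    "enc_input": {1: enc_len},                      # [N, enc_len, 2]
    "dec_input": {1: dec_len},                      # [N, dec_len, 3]
    "dec_source_mask": {2: enc_len},                # [N, 1, enc_len]
    "dec_target_mask": {1: dec_len, 2: dec_len},    # [N, dec_len, dec_len]
}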

tarun292 avatar May 28 '24 15:05 tarun292

@tarun292 I will do as you ask and let you know what I find. However, I am puzzled by this statement:

guard was generated during export that checks to ensure that L['enc_input'].size()[1] == L['dec_input'].size()[1]

I don't understand why this check would be enabled in the first place since, unless I am mistaken, enc_input.size()[1] and dec_input.size()[1] are not required to be equal. Where does that requirement come from? What am I missing or doing wrong?

Thanks

adonnini avatar May 28 '24 19:05 adonnini

@adonnini Before or after that log line (nearby) there should be another print indicating which source line generated this guard. Are you able to see that?

tarun292 avatar May 28 '24 19:05 tarun292