ai8x-synthesis
processors
Hi, I have trouble with the logic of processors in the YAML file, for example the HWC (little data) configuration for the CIFAR-100 Simple Model:
arch: ai85ressimplenet
dataset: CIFAR100

layers:
  # Layer 0
  - out_offset: 0x2000
    processors: 0x7000000000000000
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
    data_format: HWC
  # Layer 1
  - out_offset: 0x0000
    processors: 0x0ffff00000000000
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
  # Layer 2 - re-form data with gap
  - out_offset: 0x2000
    processors: 0x00000000000fffff
    output_processors: 0x00000000000fffff
    operation: passthrough
    write_gap: 1
  # Layer 3
  - in_offset: 0x0000
    in_sequences: 1
    out_offset: 0x2004
    processors: 0x00000000000fffff
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
    write_gap: 1
  # Layer 4 - Residual-1
  - in_sequences: [2, 3]
    in_offset: 0x2000
    out_offset: 0x0000
    processors: 0x00000000000fffff
    eltwise: add
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
  # Layer 5
  - out_offset: 0x2000
    processors: 0xfffff00000000000
    output_processors: 0x000000fffff00000
    max_pool: 2
    pool_stride: 2
    pad: 1
    operation: conv2d
    kernel_size: 3x3
    activate: ReLU
" why the sample doesn't start to turn on the processors from the first(right)??what is the logic behind this??
The reason to distribute the processors is memory allocation. It's not needed for all networks, but when you look at the "Kernel map" at the top of the synthesis log.txt for your example, you can see how kernel memory is associated with processors. The data memory is also associated with processors. When several layers use fewer than "all" processors, it can be useful for resource allocation to distribute the processors used.
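To make this concrete, here is a minimal standalone Python sketch (not part of ai8x-synthesis; the mask values are copied from the YAML above). Assuming the rule stated in this thread that one processor is enabled per input channel, it counts the set bits in each layer's processors mask and shows where they sit among the 64 processors, which illustrates how different layers are spread across different processors and their associated memories.

```python
# Sketch only: inspect the processors masks from the YAML above.
layer_masks = {
    0: 0x7000000000000000,  # 3 processors  -> 3 input channels (RGB image)
    1: 0x0ffff00000000000,  # 16 processors -> 16 input channels
    2: 0x00000000000fffff,  # 20 processors
    3: 0x00000000000fffff,  # 20 processors
    4: 0x00000000000fffff,  # 20 processors
    5: 0xfffff00000000000,  # 20 processors
}

for layer, mask in layer_masks.items():
    enabled = [bit for bit in range(64) if mask & (1 << bit)]
    print(f"Layer {layer}: {len(enabled)} processors, "
          f"bits {enabled[0]}..{enabled[-1]}")
```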
Thanks. How does a developer figure out how to set the mapping of processors in each layer of the model?
The number of processors required is set by the number of channels in the layer. The exact selection of which processors to use for a layer is not fixed and there may be several valid answers for any given layer. Ultimately, you may want to modify the processor assignment to optimize resource allocation.
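As an illustration of "several valid answers", here is a small sketch with a hypothetical helper (processor_mask is not an ai8x-synthesis API, just an assumption for this example) that builds a mask for a given channel count starting at an arbitrary processor. Any mask whose bit count matches the layer's channel count is a candidate; which one you pick affects resource allocation.

```python
# Hypothetical helper: enable `channels` consecutive processors
# starting at bit position `first` (64 processors total).
def processor_mask(first: int, channels: int) -> int:
    assert 0 <= first and first + channels <= 64, "only 64 processors exist"
    return ((1 << channels) - 1) << first

# Two equally valid masks for a 20-channel layer; both appear in the YAML
# above (layers 2-4 use the low 20 processors, layer 5 the high 20):
print(hex(processor_mask(0, 20)))   # 0xfffff
print(hex(processor_mask(44, 20)))  # 0xfffff00000000000
```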
Hi, thanks for the reply. Yes, there are several valid answers, but in the samples I have seen for some models like ai85net5, which has 5 layers, most of the processors are turned on in the YAML file even though only 3 channels are used, and I don't understand why some unused processors are on. For example:

arch: ai85net5
dataset: MNIST

# Define layer parameters in order of the layer sequence
layers:
  - pad: 1
    activate: ReLU
    out_offset: 0x2000
    processors: 0x0000000000000001
    data_format: CHW
    op: conv2d
  - max_pool: 2
    pool_stride: 2
    pad: 2
    activate: ReLU
    out_offset: 0
    processors: 0xfffffffffffffff0
    op: conv2d
  - max_pool: 2
    pool_stride: 2
    pad: 1
    activate: ReLU
    out_offset: 0x2000
    processors: 0xfffffffffffffff0
    op: conv2d
  - avg_pool: 2
    pool_stride: 2
    pad: 1
    activate: ReLU
    out_offset: 0
    processors: 0x0ffffffffffffff0
    op: conv2d
  - op: mlp
    flatten: true
    out_offset: 0x1000
    output_width: 32
    processors: 0x0000000000000fff
    activate: None

I would highly appreciate it if you could help me understand this.
The number of processors is set according to the number of input channels of the layer. The AI85Net5 model is defined here, and as you can observe it has 4 convolutional layers and 1 linear layer. The input channels of the convolutional layers are 1, 60, 60 and 56, which is why the YAML file sets 1 (0x0000000000000001), 60 (0xfffffffffffffff0), 60 (0xfffffffffffffff0) and 56 (0x0ffffffffffffff0) processors. The linear layer is run after flattening the input, and the flattened input spans 12 channels, so the linear layer uses 12 processors (0x0000000000000fff).
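As a quick sanity check, this short standalone snippet (masks copied from the YAML above; the layer names are placeholders, not the model's actual attribute names) confirms that the popcount of each processors mask equals the stated channel count:

```python
# Popcount of each processors mask == number of input channels of the layer.
masks = {
    "conv1": 0x0000000000000001,  # 1 input channel (MNIST is grayscale)
    "conv2": 0xfffffffffffffff0,  # 60 input channels
    "conv3": 0xfffffffffffffff0,  # 60 input channels
    "conv4": 0x0ffffffffffffff0,  # 56 input channels
    "fc":    0x0000000000000fff,  # 12 channels after flattening
}
for name, mask in masks.items():
    print(name, bin(mask).count("1"))  # prints 1, 60, 60, 56, 12
```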
Hi, thanks a lot for your reply. Now I understand completely.
Hi, I have a MAX78000FTHR board. When I change the mapping or the training model in the implementation and then synthesize the checkpoint and YAML network to run on the board, the inference time doesn't change; it is the same as for the base example code. What is the problem? I am running everything in VS Code, the same way as in the video. I have been patiently waiting for your answer.
Did you change the model architecture? If the architecture is the same, it is expected that you will measure similar inference durations.
Yes, I changed the model, trained it, and then synthesized it with the best checkpoint file. Another time I changed the processors in the YAML network, and the approximate inference time didn't change either. Is the problem related to a board update? Should I update the board again according to the setup instructions?
As I understand it, you check the inference time from the command-line output, not from the EvKit. I do not think it is about the board update, but please make sure the SDK is up to date. If you still see similar outputs, it would be best if you could share the files and the command you use to synthesize the C code.
That's right, I check the output from the terminal in VS Code, and I work with the Feather board. How can I send you my files?
Hi, please zip the following files; you may share them with us by uploading them to a cloud service or GitHub:
1- Model.py (your modified model file)
2- Network.yaml
3- Checkpoints (both quantized and unquantized)
Hi, thanks for your reply. How can I send it to you in private?
You may send the link to [email protected]
This issue has been marked stale because it has been open for over 30 days with no activity. It will be closed automatically in 10 days unless a comment is added or the "Stale" label is removed.