Vitis-Tutorials
Multiple AIE kernels in same .cc file
Hi,
I was looking at the LeNet AIE tutorial and have the queries below.
Each kernel is defined in a separate .cc file
- What is the motivation behind this? Is it to manage program memory efficiently in a tile? Each tile has its own program memory, which it does not share with other tiles, so does keeping each kernel in a separate .cc file ensure that a tile's program memory holds only the code that tile needs? Let me know whether my understanding is correct or whether there is another reason. Note: I tried the simple application example from link with the simple and simple1 kernels defined in the same kernels.cc file, and it still works.
Multiple tiles for multiple kernels
- Is there any reason for assigning a different tile to each kernel? Since the network executes sequentially, one layer after the other, running all the kernels in the same tile should not make any difference to the performance of the entire network.
- Is there any latency in data access between tiles? If yes, does it make sense to assign multiple kernels to the same tile to reduce this latency?
Multiple instances of same kernel which has runtime parameters
- Suppose a deep learning network has 100 layers, of which 50 are convolution layers. Do we need to create 50 separate convolution kernels that perform similar operations with different weights and dimensions? Is there an option or example where the same kernel can be assigned to different tiles and take different run-time parameters (weights/bias, dimensions), to avoid creating multiple .cc files/kernels?
Clarifying the above queries will help us understand more about the AIE workflows and help us deploy better to AIE.
Thanks in advance, Praveen.
Hi @praveen0447
Note: This is not specifically related to the Lenet example. For similar questions, please use the forums: https://support.xilinx.com/s/?language=en_US
_Each kernel is defined in a separate .cc file_
_What is the motivation behind this? Is it to manage program memory efficiently in a tile? Each tile has its own program memory, which it does not share with other tiles, so does keeping each kernel in a separate .cc file ensure that a tile's program memory holds only the code that tile needs? Let me know whether my understanding is correct or whether there is another reason. Note: I tried the simple application example from link with the simple and simple1 kernels defined in the same kernels.cc file, and it still works._
> This just depends on how you want to organise your files. Having one kernel per file makes each kernel easier to find. But you can also have multiple kernels per source file. This is up to you, there is no specific guidance on this.
_Is there any reason for assigning a different tile to each kernel? Since the network executes sequentially, one layer after the other, running all the kernels in the same tile should not make any difference to the performance of the entire network. Is there any latency in data access between tiles? If yes, does it make sense to assign multiple kernels to the same tile to reduce this latency?_
> You can get more performance by duplicating the kernels. If your goal is resource optimisation, you can have fewer kernels and reuse the tiles. One other thing to be careful about is the dataflow: you need to make sure you can move the data efficiently, so sometimes using multiple tiles might be more resource efficient.
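One way to see why spreading sequentially dependent layers over several tiles can still help is to consider a stream of inputs instead of a single one. The sketch below is a back-of-the-envelope model in plain C++, not AIE code. It assumes one layer per tile, an idealized pipeline, and zero data-movement cost between tiles. The function names and the per-layer cycle counts are hypothetical, and real numbers would have to come from profiling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// All layers on ONE tile: a frame must pass through every layer
// before the tile can start on the next frame.
std::uint64_t cycles_single_tile(const std::vector<std::uint64_t>& layer_cycles,
                                 std::uint64_t frames) {
    assert(!layer_cycles.empty());
    const std::uint64_t per_frame =
        std::accumulate(layer_cycles.begin(), layer_cycles.end(), std::uint64_t{0});
    return per_frame * frames;
}

// One layer per tile (idealized pipeline, transfer cost ignored):
// the first frame pays the full latency, and every later frame
// completes one "slowest layer" interval after the previous one.
std::uint64_t cycles_pipelined(const std::vector<std::uint64_t>& layer_cycles,
                               std::uint64_t frames) {
    assert(!layer_cycles.empty());
    if (frames == 0) return 0;
    const std::uint64_t fill =
        std::accumulate(layer_cycles.begin(), layer_cycles.end(), std::uint64_t{0});
    const std::uint64_t slowest =
        *std::max_element(layer_cycles.begin(), layer_cycles.end());
    return fill + (frames - 1) * slowest;
}
```

With hypothetical layer costs of 100, 300 and 200 cycles and 10 input frames, the single-tile model gives 6000 cycles and the pipelined model gives 3300. For a single frame both give 600, which matches the intuition in the question: the benefit only appears when several inputs are in flight, and it shrinks once the cost of moving data between tiles is added back in.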
_Suppose a deep learning network has 100 layers, of which 50 are convolution layers. Do we need to create 50 separate convolution kernels that perform similar operations with different weights and dimensions? Is there an option or example where the same kernel can be assigned to different tiles and take different run-time parameters (weights/bias, dimensions), to avoid creating multiple .cc files/kernels?_
> I am not sure there is an example, but this is doable. You can load the weights using the memories or using run-time parameters (refer to UG1079).
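To illustrate the idea of one kernel body serving many layers, here is a minimal sketch in plain C++. It deliberately does not use the ADF graph API. In an actual AIE design the weights and dimensions would reach the kernel through memory or run-time parameters as described in UG1079. The `ConvParams` struct and the `conv1d` function are hypothetical names made up for this example, and a 1-D convolution stands in for a real convolution layer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-layer parameter bundle (dimensions, weights, bias).
struct ConvParams {
    std::size_t in_len;          // expected input length
    std::vector<float> weights;  // filter taps for this layer
    float bias;                  // added to every output sample
};

// A single kernel body shared by all layers: "valid" 1-D convolution
// (no padding, no kernel flip), configured entirely by its parameters.
std::vector<float> conv1d(const std::vector<float>& in, const ConvParams& p) {
    assert(in.size() == p.in_len);
    assert(!p.weights.empty() && p.weights.size() <= p.in_len);
    const std::size_t out_len = p.in_len - p.weights.size() + 1;
    std::vector<float> out(out_len, p.bias);
    for (std::size_t i = 0; i < out_len; ++i) {
        for (std::size_t k = 0; k < p.weights.size(); ++k) {
            out[i] += in[i + k] * p.weights[k];
        }
    }
    return out;
}
```

Two "layers" can then reuse the same function with different parameters, for example `conv1d(x, ConvParams{4, {1.0f, 1.0f}, 0.0f})` followed by `conv1d(y, ConvParams{3, {2.0f}, 1.0f})`, without writing a second kernel source file.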
Hi @xflorentw, thanks for getting back to me.
Regarding:
> But you can also have multiple kernels per source file. This is up to you, there is no specific guidance on this
If I create multiple kernels in one source file and assign a separate kernel to each tile, will the program memory of each tile include the code for all the kernels, since we are compiling the same source file for all the tiles?
Or is this program memory distribution handled internally by the compiler?
The reason for asking is to get more clarity on how program memory is distributed and how to better optimize it.
Thanks in advance.
@praveen0447
My expectation is that this will not impact the program memory. You should be able to verify this quickly.