where is auxiliary information

Open ccblublu opened this issue 11 months ago • 23 comments

Thanks for your excellent work. I still have a question about the multi-level auxiliary information in Section 4.1.2 of your paper: I cannot find where it is used in your code. I only find 6 heads, in the main branch and the aux branch, as shown in the .yaml.

ccblublu avatar Mar 07 '24 12:03 ccblublu

@WongKinYiu

ccblublu avatar Mar 07 '24 12:03 ccblublu

https://github.com/WongKinYiu/yolov9/blob/main/models/detect/yolov9-c.yaml#L81-L116 https://github.com/WongKinYiu/yolov9/blob/main/models/detect/yolov9-e.yaml#L88-L105

WongKinYiu avatar Mar 07 '24 13:03 WongKinYiu

Thanks for your reply. In my opinion, that is the aux branch, shown as the grey area in the figure. Where is the pink area? @WongKinYiu

image

ccblublu avatar Mar 08 '24 01:03 ccblublu

https://github.com/WongKinYiu/yolov9/blob/main/models/detect/yolov9-e.yaml#L88-L105

"multi-level auxiliary branch" is pink area.

sanha9999 avatar Mar 08 '24 06:03 sanha9999

The figure shows 3+3+3 = 9 heads, but the yaml has only 6, which confuses me. Is there any detail I missed? Also, as shown in yolov9-e.yaml, the relationship between the main branch and the aux branch differs from the figure in the paper. Looking forward to your reply! @WongKinYiu @sanha9999

image

ccblublu avatar Mar 08 '24 10:03 ccblublu

I have this question too. I have already found the multi-level auxiliary branch in the yaml files, but I have not found the auxiliary reversible branch. Also, there are nine prediction heads in the paper's Figure 3, but only six in the yaml files. Why is this?

YoohLee avatar Mar 09 '24 08:03 YoohLee

The figure shows 3+3+3 = 9 heads, but the yaml has only 6, which confuses me. Is there any detail I missed? Also, as shown in yolov9-e.yaml, the relationship between the main branch and the aux branch differs from the figure in the paper. Looking forward to your reply! @WongKinYiu @sanha9999

image

I have the same question. There are only six heads in the model defined by the yolov9-e yaml file, and I am confused. If anyone understands this, please tell me.

wgqhandsome avatar Mar 09 '24 14:03 wgqhandsome

You could take a look at Table 4. If you have aux branches on both the backbone and the neck, you will have 3+3+3 heads, and you could use train_triple.py to train the model.
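The head arithmetic in this reply can be sketched as follows. This is only an illustration: `head_layout` is a hypothetical helper, not a function in the repository, and the mapping of one and two head groups to `train.py` and `train_dual.py` is an assumption based on the script names in the repository. Only the `train_triple.py` case is stated in the reply above.

```python
# Hypothetical helper (not part of the YOLOv9 repository).
# Each enabled branch contributes one group of heads, one head per
# pyramid level (P3/P4/P5, so 3 levels by default).
def head_layout(aux_on_backbone: bool, aux_on_neck: bool, levels: int = 3):
    groups = 1 + int(aux_on_backbone) + int(aux_on_neck)  # main branch always counts
    # Assumed mapping from number of head groups to training script.
    script = {1: "train.py", 2: "train_dual.py", 3: "train_triple.py"}[groups]
    return groups * levels, script


print(head_layout(False, False))  # main branch only
print(head_layout(True, False))   # main + one aux branch (the 6-head yaml case)
print(head_layout(True, True))    # main + aux on backbone + aux on neck
```

Under these assumptions, the 6 heads seen in the released yaml files correspond to the main branch plus a single auxiliary branch, and the 9 heads in the figure to the configuration with auxiliary branches on both backbone and neck.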

WongKinYiu avatar Mar 10 '24 09:03 WongKinYiu

If I want 9 heads, I need to select the corresponding YAML file and, based on the open-source code, add the corresponding layers as inputs to the prediction head. From my observation, the main differences between the train_*.py scripts are the loss function and the choice of detection head. Am I correct in my understanding? @WongKinYiu

ccblublu avatar Mar 11 '24 02:03 ccblublu

I have this question too. I have already found the multi-level auxiliary branch in the yaml files, but I have not found the auxiliary reversible branch. Also, there are nine prediction heads in the paper's Figure 3, but only six in the yaml files. Why is this?

I think the reversible structure is the multi-level connections and the fusion.

ccblublu avatar Mar 11 '24 02:03 ccblublu

@WongKinYiu Could we have a diagram or explanation of what an auxiliary branch does in terms of operations/transformations, maybe some links to the code as well?

It's clear that it serves the purpose of preserving the input-target relation throughout the layers, in a parallel branch, but stated like that it is kind of black magic to me.

The same applies to the multi-level auxiliary branch and the prediction heads: how are they plugged into the rest of the network?

masc-it avatar Mar 17 '24 18:03 masc-it

Which part of the source code specifically implements the operation of the reversible auxiliary branch? There are still doubts about the implementation of this part of the function. Can you explain the specific code? @WongKinYiu

JaneM1222 avatar Mar 19 '24 02:03 JaneM1222

The figure shows 3+3+3 = 9 heads, but the yaml has only 6, which confuses me. Is there any detail I missed? Also, as shown in yolov9-e.yaml, the relationship between the main branch and the aux branch differs from the figure in the paper. Looking forward to your reply!

image

The main branch of yolov9-e contains the reversible architecture. The architecture of yolov9-e is modified from dynamic-yolov7.

WongKinYiu avatar Mar 20 '24 01:03 WongKinYiu

Which part of the source code specifically implements the operation of the reversible auxiliary branch? There are still doubts about the implementation of this part of the function. Can you explain the specific code?

https://github.com/WongKinYiu/yolov9/blob/main/models/detect/yolov9-c.yaml#L81-L116

WongKinYiu avatar Mar 20 '24 01:03 WongKinYiu

A lack of proper explanation and understanding won't let this model get the attention it deserves. Please consider writing a technical post or something similar where you explain in more detail where, how and why things are placed in a certain way and why they work. Pointing to a YAML file is of no help.

Also, the yolov5 codebase you're based on builds the model arch at runtime and that doesn't help to visualize the flow. A class with all the layers would help as well.

I'm a scientist like you, but please pay more attention to code quality and proper documentation. That stuff makes the difference. Kudos for your next works.

masc-it avatar Mar 20 '24 06:03 masc-it

  1. The original deep networks lose information during the feed-forward pass.

image

  2. In particular, they lose the information needed to project data to the target.

image

  3. Visualizing this shows that the lost information prevents the network from finding the correct relation for projecting data to the target.

image

  4. Modern networks can maintain more reliable information.

image

  5. Theoretically, a reversible architecture lets deeper networks maintain most of the information in the data.

image

  6. We also show evidence that the reversible branch yields a more trustworthy relation between data and target.

image

  7. The corresponding part of the reversible architecture is here.

https://github.com/WongKinYiu/yolov9/blob/main/models/detect/yolov9-c.yaml#L81-L116
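To make "reversible" concrete, here is a toy sketch of an additive coupling block in the style of RevNet. This is a swapped-in, generic technique chosen for illustration; it is not the YOLOv9, CBNet or RevCol code, and the functions `f`, `g` and `lossy` are arbitrary placeholders. The point is only that the coupled block can recover its inputs exactly even though `f` and `g` themselves discard information, while an ordinary lossy layer cannot.

```python
# Toy illustration of a reversible block (RevNet-style additive coupling).
# NOT the YOLOv9 implementation; f, g and lossy are arbitrary placeholders.

def f(v):
    return [a * a for a in v]      # squaring discards the sign


def g(v):
    return [abs(a) for a in v]     # abs also discards the sign


def forward(x1, x2):
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def inverse(y1, y2):
    # Undo the two additions in reverse order; f and g are only re-evaluated,
    # never inverted, so they do not need to be invertible themselves.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


def lossy(x1, x2):
    # An ordinary non-reversible layer: different inputs can collide.
    return [max(a, b) for a, b in zip(x1, x2)]


x1, x2 = [1, -2], [3, 4]
print(inverse(*forward(x1, x2)) == (x1, x2))     # inputs are recovered
print(lossy(x1, x2) == lossy([0, 0], x2))        # two inputs, same output
```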

WongKinYiu avatar Mar 20 '24 08:03 WongKinYiu

thanks, I've a couple of questions:

  1. How did you find that gelan actually satisfies the "reversible" property? Just by looking at feature maps? What's behind elan in general that makes it better than CSP and the like?
  2. What's the role of CBLinear from CBNet?
  3. Have you done ablations of some sort?

masc-it avatar Mar 20 '24 13:03 masc-it

No, GELAN does not satisfy the reversible property, but CBNet and RevCol do. For an analysis of ELAN and CSP, please take a look at the ELAN paper.

CBLinear is just a set of linear layers that give the different pyramidal feature maps the same number of channels. The main part that carries the reversible property is CBFuse, which composites higher-level features of the first backbone into lower-level features of the second backbone.

We have examined the different composition styles in CBNet; the best performance comes from full DHLC and FCC. However, because the reversible auxiliary branch may be attached at different positions of the GELAN-PAN architecture, for a fair comparison we only apply DHLC composition at the P3/P4/P5 levels. The related ablations are shown in Table 4.
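The two operations described in this reply can be sketched in plain Python on tiny nested lists. This is a simplified illustration under stated assumptions, not the repository's `CBLinear` or `CBFuse` modules: the function names `linear_1x1`, `upsample_nearest` and `fuse_add` are hypothetical, the projection is written as a per-pixel linear map across channels, and the fusion is written as nearest-neighbour resizing followed by element-wise addition, as the thread describes it.

```python
# Simplified sketch; feature maps are nested lists shaped [C][H][W].
# Function names are hypothetical and not taken from the repository.

def linear_1x1(feat, weights):
    """Per-pixel linear map across channels.
    feat: [C_in][H][W], weights: [C_out][C_in] -> [C_out][H][W]."""
    h, w = len(feat[0]), len(feat[0][0])
    return [[[sum(wk[c] * feat[c][y][x] for c in range(len(feat)))
              for x in range(w)] for y in range(h)] for wk in weights]


def upsample_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of [C][H][W] to [C][out_h][out_w]."""
    h, w = len(feat[0]), len(feat[0][0])
    return [[[ch[y * h // out_h][x * w // out_w] for x in range(out_w)]
             for y in range(out_h)] for ch in feat]


def fuse_add(target, projected_sources):
    """Resize each projected source to the target size and add element-wise."""
    out_h, out_w = len(target[0]), len(target[0][0])
    fused = [[row[:] for row in ch] for ch in target]  # copy the target
    for src in projected_sources:
        up = upsample_nearest(src, out_h, out_w)
        for c in range(len(fused)):
            for y in range(out_h):
                for x in range(out_w):
                    fused[c][y][x] += up[c][y][x]
    return fused


# Higher-level (coarse) feature from the first backbone: 2 channels, 1x1.
coarse = [[[1]], [[2]]]
# Lower-level (fine) feature of the second backbone: 1 channel, 2x2.
fine = [[[10, 20], [30, 40]]]

projected = linear_1x1(coarse, [[1, 1]])   # match the channel count: 2 -> 1
fused = fuse_add(fine, [projected])        # upsample 1x1 -> 2x2, then add
print(fused)
```

The projection step exists so that the element-wise addition is well defined: features from different pyramid levels must have the same number of channels before they can be summed.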

WongKinYiu avatar Mar 20 '24 16:03 WongKinYiu

Can I interpret the nearest-neighbor upsampling and Add operation in CBFuse as the reversible structure? Or does it mean that the structure that fuses high-level information from the first backbone into the second backbone is reversible? In other words, if CBFuse is replaced with Concat, is it still a reversible structure?

Muyyong avatar Mar 29 '24 02:03 Muyyong

Concat is OK

WongKinYiu avatar Mar 29 '24 02:03 WongKinYiu

The figure shows 3+3+3 = 9 heads, but the yaml has only 6, which confuses me. Is there any detail I missed? Also, as shown in yolov9-e.yaml, the relationship between the main branch and the aux branch differs from the figure in the paper. Looking forward to your reply!

image

The main branch of yolov9-e contains the reversible architecture. The architecture of yolov9-e is modified from dynamic-yolov7.

Hi @WongKinYiu , I do have some questions about the PGI mechanism. There are 2 differences between Figure 4 in the paper and the figure above:

  1. The forward path is different. In the yaml, each path takes only one input from the same level, whereas the paper shows two. Why?
  2. If I understand correctly, the mechanism in yolov9-e.yaml is the auxiliary reversible branch, given its multi-column/multi-backbone design. Why doesn't reparameterization.ipynb remove that branch, or why is the branch regarded as part of the main branch?

discipleofhamilton avatar Apr 26 '24 06:04 discipleofhamilton