
Congratulations on the great reproduction

Open HeathHose opened this issue 5 years ago • 12 comments

Thanks for your great AutoML reimplementation! Could you please release the architecture found during the search?

HeathHose avatar Nov 15 '19 07:11 HeathHose

Is the search config used in obtain_search_args? Also, I found that AutoML is slightly inconsistent with official_deeplab. The official code does the following:

    if prev_layer is None:
      prev_layer = net

AutoML, on the other hand, does this:

    if prev_prev_fmultiplier == -1 and j == 0:
        op = None

  • AutoML ignores s0, which some corner cells don't have, while official_deeplab copies s1 as s0.
  • Granted, the TensorFlow code is only for testing the found architecture, not for running the search, but I don't think the code above was written casually.

So I emailed the author, and the reply was as follows:

For those cells, I instead used the tensor with the most similar spatial size. Typically this means 4x larger tensor for l-2, and 2x larger tensor for l-1. More specifically, when 8H1 preprocesses stem0, it will use stride 4, and when it preprocesses stem1, it will use stride 2.
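In code, the author's rule amounts to picking the preprocessing stride from the ratio of spatial sizes. A minimal sketch (assuming the usual two-layer stem at 1/2 and 1/4 of the input resolution and a 512x512 input; `preprocess_stride` is a hypothetical helper, not from the repo):

```python
def preprocess_stride(src_size, dst_size):
    """Stride that brings a larger source feature map down to the
    cell's spatial size (assumes the sizes divide evenly)."""
    return src_size // dst_size

# Example following the author's reply for cell 8H1 (1/8 resolution):
# with a 512x512 input, stem0 is at 1/2 (256) and stem1 at 1/4 (128),
# while the cell works at 64.
print(preprocess_stride(256, 64))  # 4 -> stride used to preprocess stem0 (l-2)
print(preprocess_stride(128, 64))  # 2 -> stride used to preprocess stem1 (l-1)
```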

Yuan Fang, what do you think?

HeathHose avatar Nov 15 '19 08:11 HeathHose

Thanks, we will fix it just as the author describes!

zhizhangxian avatar Nov 18 '19 12:11 zhizhangxian


@HeathHose Nice reply from the authors, but it confused me a lot. Could you give me an example of how to handle the case when prev_prev_c (s0) is None? What does "use the tensor with the most similar spatial size" mean? Is it just using a copy of prev_c (s1)? Thanks in advance!

mrluin avatar Nov 19 '19 08:11 mrluin

@mrluin For example, 8H1 is the first node at level 8, so it has neither s0 nor s1. The author therefore preprocesses stem0 with stride 4 and stem1 with stride 2 to serve as s0 and s1 respectively.

HeathHose avatar Nov 19 '19 09:11 HeathHose

@mrluin Hi, I've bounced back and forth about how best to do this. The authors doing it this way doesn't automatically make it the best approach; I think there may be a better way. However, it won't be implemented soon, and if you think you can improve on our results, please try to do so. Should you get better results, I would love for you to make a pull request and become a contributor to the project.

NoamRosenberg avatar Nov 19 '19 10:11 NoamRosenberg


I'm getting on it and will see how the result turns out. I'll report it here whether it gets better or not.

HankKung avatar Nov 19 '19 15:11 HankKung

[screenshot: search results, 2019-11-27]

    architecture search results: [1 1 2 2 2 3 3 2 1 2 3 3]
    new cell structure: [[ 0  4] [ 1  4] [ 4  5] [ 2  4] [ 7  5]
                         [ 6  5] [13  5] [ 9  4] [17  5] [18  4]]

This is the version where I've added pre_pre_input for those edge tensors. Although the mIoU decreases slightly, the stem has only two layers (to stay consistent with the paper). In addition, all downsampling is implemented with stride 2 or stride 4.

As we can see, the search had not converged yet: the derived cell structures are all sep conv, with not even one dilated conv, which shouldn't happen. But I still think padding pre_pre_input for those edge tensors is necessary.

Searching for more epochs is the straightforward option. An alternative is to improve the robustness of the search; I recently found some interesting research on this: https://openreview.net/pdf?id=H1gDNyrKDS.

HankKung avatar Nov 27 '19 00:11 HankKung

@HankKung Hi, thanks for your hard work!

This is the one I've added pre_pre_input for those edge tensors.

Do you add pre_pre_input for those edge tensors as in the following table? And how do you perform the stride-4 downsampling? Do you also use FactorizedReduce?

Add pre_pre_input for the first nodes at each level (4, 8, 16, 32):
-------------------------------------------
level-node | pre_pre_input | pre_input
-------------------------------------------
4-2        output of stem0  output of stem1
8-1        output of stem0  output of stem1
8-2        output of stem1  output of 8-1
16-1       output of stem1  output of 8-1
16-2       output of 8-1    output of 16-1  
32-1       output of 8-1    output of 16-1
32-2       output of 16-1   output of 32-1

But after reading the official code of autodeeplab (the derived model), I found that the stem has three conv layers rather than two.

And how do you implement the derived model? The same as the official code (if pre_pre_input is None, treat it as a copy of pre_input)?

Looking forward to your reply!

mrluin avatar Dec 02 '19 12:12 mrluin


About the stem: they used a two-layer one during the search and a three-layer one during weight training (retraining), as you mentioned. Yes, I used FactorizedReduce and DoubleFactorizedReduce for stride 2 and stride 4 respectively.

[screenshot: 2019-12-02]

Since I aim to stay consistent with the paper, the stem I implemented has only two layers, so the pre_pre_inputs for the nodes are almost the same as the ones you described. The only difference is that I also add the output of stem0 for node 4-1.
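For the stride-4 path, one plausible DoubleFactorizedReduce can be written by analogy with DARTS' FactorizedReduce. This is a sketch under that assumption; the repo's actual implementation may differ:

```python
import torch
import torch.nn as nn

class DoubleFactorizedReduce(nn.Module):
    """Sketch of a stride-4 reduction in the spirit of DARTS'
    FactorizedReduce (an assumption, not the repo's exact code)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.relu = nn.ReLU(inplace=False)
        # Two parallel 1x1 convs with stride 4, one shifted by a pixel,
        # so the reduction does not systematically discard positions.
        self.conv1 = nn.Conv2d(c_in, c_out // 2, 1, stride=4, bias=False)
        self.conv2 = nn.Conv2d(c_in, c_out - c_out // 2, 1, stride=4, bias=False)
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, x):
        x = self.relu(x)
        out = torch.cat([self.conv1(x), self.conv2(x[:, :, 1:, 1:])], dim=1)
        return self.bn(out)

# A 64x64 map comes out at 16x16, i.e. downsampled by 4.
y = DoubleFactorizedReduce(16, 32)(torch.randn(1, 16, 64, 64))
print(y.shape)  # torch.Size([1, 32, 16, 16])
```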

I haven't evaluated the performance of the derived model yet, but there the pre_pre_input should always exist, because it is simply the output from two cells back (e.g., stem2's output as level 1's pre_pre_input); no tensor of the same spatial resolution is needed.

Glad to help! If you have any ideas or questions, I'm happy to discuss.

HankKung avatar Dec 02 '19 12:12 HankKung

Oh, that's very helpful! I made a mistake in the list above. In the case of the two-layer stem:

4-1 output of stem0, output of stem1
4-2 output of stem1, output of 4-1

Now I think I've got your idea. Thank you very much!
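Putting the corrected 4-1/4-2 rows together with the table above, the whole assignment fits in a small lookup. A sketch; the name EDGE_INPUTS is hypothetical and not from the repo:

```python
# For the first node(s) at each level, which earlier outputs stand in
# for pre_pre_input (s0) and pre_input (s1), per the discussion above.
EDGE_INPUTS = {
    "4-1":  ("stem0", "stem1"),
    "4-2":  ("stem1", "4-1"),
    "8-1":  ("stem0", "stem1"),
    "8-2":  ("stem1", "8-1"),
    "16-1": ("stem1", "8-1"),
    "16-2": ("8-1",  "16-1"),
    "32-1": ("8-1",  "16-1"),
    "32-2": ("16-1", "32-1"),
}

s0, s1 = EDGE_INPUTS["8-1"]
print(s0, s1)  # stem0 stem1
```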

mrluin avatar Dec 02 '19 13:12 mrluin

@HankKung Thank you for the clarification about the network. I am wondering what each number in the cell-structure output means, and how we can print(genotype) instead of decode. For example, here is the output of the new cell structure (genotype); what does each number mean? Any help would be greatly appreciated.

new cell structure: [[ 1  5]
 [ 0  4]
 [ 2  4]
 [ 3  0]
 [ 5  4]
 [ 7  4]
 [11  7]
 [12  4]
 [17  4]
 [18  2]]
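One hedged reading of such a genotype, assuming the repo follows a DARTS-style search space: each row is [edge_index, op_index], where the first number selects which earlier tensor the edge starts from and the second indexes a primitive list. The PRIMITIVES order below is the standard DARTS one and is an assumption; check the repo's operations/genotypes module for the real order. (It is at least consistent with the earlier observation that cells full of 4s and 5s decode to sep convs.)

```python
# Standard DARTS primitive list (assumed order; verify against the repo).
PRIMITIVES = [
    'none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect',
    'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5',
]

cell = [[1, 5], [0, 4], [2, 4], [3, 0], [5, 4],
        [7, 4], [11, 7], [12, 4], [17, 4], [18, 2]]

# Decode each [edge_index, op_index] row into a readable pair.
for edge, op in cell:
    print(edge, PRIMITIVES[op])  # e.g. first row: 1 sep_conv_5x5
```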

NdaAzr avatar May 05 '20 11:05 NdaAzr


Hey, I have the same problem. Have you found the answer?

sdszqs avatar Mar 19 '21 03:03 sdszqs