What the step means in controller

Open NLGithubWP opened this issue 2 years ago • 5 comments

What does the step mean in the controller?

for node in range(self.steps):

I think it's the number of blocks in each cell. In the original paper, there are 5 blocks in both the normal cell and the reduction cell.

Why is it 4 here?

NLGithubWP avatar Apr 29 '22 13:04 NLGithubWP

I have the same question, and also one about the outer loop "for type in range(2):". Do you know what that outer loop is used for? I guess "type" here means the two types of cells, but I am not sure.

qwangku avatar Dec 20 '22 21:12 qwangku

Thanks for your interest in this repo! Actually, I haven't done NAS for a long time and have forgotten some details of this code. A1: "self.steps" corresponds to the number of nodes in each cell; I employed the cell space of DARTS. A2: Yes, "range(2)" means the two types of cells, Normal Cell and Reduction Cell.
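As an illustration of the two loops, here is a minimal sketch of sampling a DARTS-style architecture. It is not this repo's controller: `sample_architecture`, `CELL_TYPES`, and the `OPS` list are hypothetical names, and the uniform random choices stand in for whatever a learned policy would sample. It assumes the DARTS convention of 4 intermediate nodes per cell, each reading two inputs picked from the two cell inputs plus all earlier nodes.

```python
import random

# Hypothetical names for illustration only.
CELL_TYPES = ["normal", "reduction"]
OPS = ["sep_conv_3x3", "max_pool_3x3", "skip_connect"]  # illustrative subset


def sample_architecture(steps=4, rng=None):
    """Sample one architecture uniformly at random.

    A learned controller would replace the uniform choices with samples
    from its policy; the loop structure is the point of this sketch.
    """
    rng = rng or random.Random(0)
    arch = {}
    for cell_type in CELL_TYPES:       # plays the role of range(2)
        cell = []
        for node in range(steps):      # plays the role of range(self.steps)
            # Node `node` may read the 2 cell inputs plus all earlier nodes.
            num_candidates = node + 2
            inputs = rng.sample(range(num_candidates), 2)
            ops = [rng.choice(OPS) for _ in inputs]
            cell.append(list(zip(inputs, ops)))
        arch[cell_type] = cell
    return arch


arch = sample_architecture()
for cell_type, cell in arch.items():
    print(cell_type, cell)
```

Under these assumptions, the number of candidate inputs grows with the node index (2, 3, 4, 5 for `steps=4`), and each cell ends up with `steps * 2` chosen edges.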

MarSaKi avatar Dec 21 '22 09:12 MarSaKi

@MarSaKi Thanks for sharing this great material. I have another question, about the losses. I am reading the cal_loss() implementations in policy_gradient.py and PPO.py. I am a beginner in reinforcement learning, and I couldn't find the loss-related details in the NASNet paper. Could you please point me to the original formulas that your cal_loss() functions are based on?

For example, I am not sure what "entropy loss" means here and the implementation of policy loss seems more advanced compared to the vanilla ones from my textbook : )

qwangku avatar Dec 21 '22 18:12 qwangku

To calculate the loss, I use the clipped version of PPO. You can refer to Sec. 6.1 of the PPO paper. "entropy loss" is just a typical regularization term in RL to avoid policy overfitting. Maybe you can refer to this course.
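For readers comparing against a textbook policy gradient, here is a minimal sketch of the clipped surrogate objective with an entropy bonus, written in plain Python. It is not the repo's cal_loss(): the function names, the default `clip_eps=0.2`, and the default `entropy_coef=0.01` are assumptions chosen for illustration.

```python
import math


def categorical_entropy(probs):
    """Entropy of a categorical distribution given as a list of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, entropies,
                  clip_eps=0.2, entropy_coef=0.01):
    """Return (total_loss, policy_loss, entropy_loss), averaged over samples.

    All names and defaults here are hypothetical, for illustration only.
    """
    n = len(advantages)
    surrogate = 0.0
    for lp_new, lp_old, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the sampling policy.
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        surrogate += min(ratio * adv, clipped * adv)
    policy_loss = -surrogate / n
    # Negative sign: minimizing this term raises the policy's entropy.
    entropy_loss = -entropy_coef * sum(entropies) / n
    return policy_loss + entropy_loss, policy_loss, entropy_loss


total, policy, entropy = ppo_clip_loss(
    new_log_probs=[math.log(0.6)],
    old_log_probs=[math.log(0.3)],
    advantages=[1.0],
    entropies=[categorical_entropy([0.6, 0.4])],
)
print(total, policy, entropy)
```

In this usage example the ratio is about 2, so with a positive advantage the surrogate is clipped at `1 + clip_eps`. The entropy term rewards policies that keep probability spread over several actions, which is the regularization role described above.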

MarSaKi avatar Dec 22 '22 04:12 MarSaKi

What are the node decoders for? Why do they have different output dimensions?

hardikk65 avatar May 20 '24 21:05 hardikk65