Resources-Allocation-in-The-Edge-Computing-Environment-Using-Reinforcement-Learning

Convergence - the graph

Open HuongDM1896 opened this issue 3 years ago • 4 comments

Morning, David. Thanks a lot for sharing the code. I have a question about the results (the trend of the graph). In DDPG, the goal is to maximize the reward: it should increase gradually until convergence, where it fluctuates around a stable value instead of continuing to increase. Your results don't show that. Could you tell me what your stopping criterion is? When does the training process stop? Once again, thank you so much for your help. Your code has helped me a lot with my study.

HuongDM1896 avatar Feb 23 '22 01:02 HuongDM1896

Hello, I would like to know this too. Can we discuss it together?

Sanmu123-lab avatar Apr 17 '22 05:04 Sanmu123-lab

Hello, I have the same question about the convergence of the reward graph. Have you solved this problem already? I would appreciate it if you could share an answer.

Jadaeu avatar Jul 23 '23 08:07 Jadaeu

@Sanmu123-lab @Jadaeu Hi, I solved this. Actually, it is not a problem: the trend of the graph is good, it just has not converged yet. First, we need to know when the training process stops. The answer is that they check the MSE over the last 10 episodes. This value (10) is defined by LEARNING_MAX_EPISODE. I extended it to 100, 200, ..., and to make training faster, you can reduce MAX_EP_STEPS to 500 or 1000.
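For anyone reading later, here is a minimal sketch of what a "check the MSE over the last N episodes" stopping rule can look like. It is not the repository's actual code: the helper names `window_mse` and `has_converged` and the threshold `TOLERANCE` are made up for illustration, and only the name LEARNING_MAX_EPISODE comes from this thread.

```python
import numpy as np

LEARNING_MAX_EPISODE = 10  # window length; name taken from this thread
TOLERANCE = 1e-3           # hypothetical threshold, not from the repository


def window_mse(episode_rewards, window=LEARNING_MAX_EPISODE):
    """Mean squared deviation of the last `window` rewards from their mean."""
    recent = np.asarray(episode_rewards[-window:], dtype=float)
    return float(np.mean((recent - recent.mean()) ** 2))


def has_converged(episode_rewards, window=LEARNING_MAX_EPISODE, tol=TOLERANCE):
    """Return True once the last `window` episode rewards are nearly flat."""
    if len(episode_rewards) < window:
        # Not enough episodes yet to fill the window.
        return False
    return window_mse(episode_rewards, window) < tol


# Usage: rewards that rise and then flatten out trigger the stop condition.
rewards = [float(r) for r in range(20)] + [19.0] * 10
print(has_converged(rewards))  # flat tail, so the check passes
```

With a small window, a short flat stretch in a still-rising curve can pass a check like this, which is one reason a larger window (100, 200, ...) tends to give a more convincing convergence plot.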

HuongDM1896 avatar Aug 31 '23 08:08 HuongDM1896

Alright, thank you!

Jadaeu avatar Sep 01 '23 08:09 Jadaeu