Optimal-Energy-System-Scheduling-Combining-Mixed-Integer-Programming-and-Deep-Reinforcement-Learning

Some questions about MIP_DQN.py

Mrdawnzero opened this issue 2 years ago • 8 comments

Hello, I'd like to understand how to use the "Actor_MIP" class in the provided code. This part is mentioned as a highlight in your paper, but it seems that the class is not called or utilized in the code. I'm interested in learning how this class should be employed.

Mrdawnzero avatar Oct 25 '23 07:10 Mrdawnzero

Same question from me. Thanks.

Kacjer avatar Dec 06 '23 10:12 Kacjer

This class is called after training is finished, during the real-time implementation. After training, you have a well-trained Q network (the critic network) and an actor network. We do not use the actor network; instead, we load the trained Q network into this class. Given a state as input, the method "predict_best_action" in this class then solves a MIP over the Q network to obtain the optimal action.
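For readers asking how this looks in practice, here is a minimal sketch of such an online test loop. Only the class name Actor_MIP and the method predict_best_action come from MIP_DQN.py; the critic class name, the environment helper, the checkpoint path, and the constructor arguments are assumptions and may differ from the repo.

```python
import torch
from MIP_DQN import Actor_MIP, CriticQ   # CriticQ is a placeholder name for the critic class

env = build_env()                         # hypothetical helper: the same environment used for training

# Load the trained critic (Q network); checkpoint path and constructor are illustrative.
critic = CriticQ(state_dim=env.state_dim, action_dim=env.action_dim)
critic.load_state_dict(torch.load("trained_critic.pth"))
critic.eval()

# The trained actor network is not used online; only the Q network is wrapped
# in the MIP-based actor. Constructor arguments are assumptions.
mip_actor = Actor_MIP(critic, env)

state = env.reset()
done = False
while not done:
    # For the current state, solve the embedded MIP to find the feasible action
    # that maximizes Q(state, action), then apply it to the environment.
    action = mip_actor.predict_best_action(state)
    state, reward, done, info = env.step(action)
```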

EnergyQuantResearch avatar Dec 06 '23 10:12 EnergyQuantResearch

Hello, how do we test after training? There is no such step in the code. Could you please add to the code the online testing procedure described in your article? How is the trained Q-network used in conjunction with MIP? Looking forward to your reply!

lmd123123 avatar Dec 15 '23 10:12 lmd123123

This class is called after training is finished, during the real-time implementation. After training, you have a well-trained Q network (the critic network) and an actor network. We do not use the actor network; instead, we load the trained Q network into this class. Given a state as input, the method "predict_best_action" in this class then solves a MIP over the Q network to obtain the optimal action.

Hello, Hou. I am running your code, but I noticed that you set self.num_episode = 3000 in the Argument class. Why do you set it so large?

Bang0518 avatar Dec 21 '23 07:12 Bang0518

This class is called after training is finished, during the real-time implementation. After training, you have a well-trained Q network (the critic network) and an actor network. We do not use the actor network; instead, we load the trained Q network into this class. Given a state as input, the method "predict_best_action" in this class then solves a MIP over the Q network to obtain the optimal action.

I want to know how you determine that the Q network is well trained. Additionally, I used the OMLT package to model the trained network as a MIP, but I find that the results are not as good as the Q network's. I would like to know how you tuned the parameters.
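For reference, here is a minimal sketch of the kind of OMLT formulation being described, following the pattern in the OMLT documentation: the critic is first exported to ONNX with input bounds, then loaded into a Pyomo model as an OmltBlock and encoded as a MIP (assuming ReLU activations). The file name, state values, input ordering, and solver choice are illustrative and may need adjusting for the network in this repo.

```python
import pyomo.environ as pyo
from omlt import OmltBlock
from omlt.neuralnet import ReluBigMFormulation
from omlt.io import load_onnx_neural_network_with_bounds

# Load the ONNX export of the critic together with its input bounds
# (written beforehand with omlt.io.write_onnx_model_with_bounds).
net = load_onnx_neural_network_with_bounds("critic.onnx")   # hypothetical file name

m = pyo.ConcreteModel()
m.q_net = OmltBlock()
# Attach the mixed-integer (big-M) encoding of the ReLU network to the block;
# the block's inputs/outputs only exist after this call.
m.q_net.build_formulation(ReluBigMFormulation(net))

# Fix the state components of the network input to the current observation and
# leave the action components free (the input ordering here is an assumption).
current_state = [0.5, 0.3, 0.8]                              # illustrative values
for i, s in enumerate(current_state):
    m.q_net.inputs[i].fix(s)

# Maximize the predicted Q value over the free (action) inputs.
m.obj = pyo.Objective(expr=m.q_net.outputs[0], sense=pyo.maximize)
pyo.SolverFactory("gurobi").solve(m)                         # solver choice is illustrative

best_action = [pyo.value(m.q_net.inputs[i])
               for i in range(len(current_state), len(m.q_net.inputs))]
```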

Mrdawnzero avatar Dec 29 '23 09:12 Mrdawnzero

Hello, Mr. Hou Shengren. After loading the trained critic network into Actor_MIP, it reports that the block is a 'NoneType' object and is not callable, and that the formulation is invalid. How should this be corrected? [screenshots attached]

WangDing1030 avatar Jul 25 '24 13:07 WangDing1030

Why is it that when I run your code I cannot fully reproduce the results of your paper, or there is a problem with values going over the limits, along with an endless stream of warning messages? How can this be solved? Hoping for an answer! [screenshot attached]

QIUQIUSCI avatar Aug 01 '24 08:08 QIUQIUSCI

Hello, how do we test after training? There is no such step in the code. Can you provide in your code the procedure for running the online tests described in your article? How is a trained Q-network used with MIP? Looking forward to your reply! Why isn't this important part open-sourced?

QIUQIUSCI avatar Aug 01 '24 08:08 QIUQIUSCI

Hello, Mr. Hou Shengren. After loading the trained critic network into Actor_MIP, it reports that the block is a 'NoneType' object and is not callable, and that the formulation is invalid. How should this be corrected? [screenshots attached]

Can I talk to you? I also found that it is not possible to reproduce the results reported in the author's paper.

QIUQIUSCI avatar Aug 01 '24 09:08 QIUQIUSCI

This class is called after training is finished, during the real-time implementation. After training, you have a well-trained Q network (the critic network) and an actor network. We do not use the actor network; instead, we load the trained Q network into this class. Given a state as input, the method "predict_best_action" in this class then solves a MIP over the Q network to obtain the optimal action.

I want to know how you determine that the Q network is well trained. Additionally, I used the OMLT package to model the trained network as a MIP, but I find that the results are not as good as the Q network's. I would like to know how you tuned the parameters.

Can I talk to you? I also found that it is not possible to reproduce the results reported in the author's paper.

QIUQIUSCI avatar Aug 01 '24 09:08 QIUQIUSCI

This class is called after training is finished, during the real-time implementation. After training, you have a well-trained Q network (the critic network) and an actor network. We do not use the actor network; instead, we load the trained Q network into this class. Given a state as input, the method "predict_best_action" in this class then solves a MIP over the Q network to obtain the optimal action.

I want to know how you determine that the Q network is well trained. Additionally, I used the OMLT package to model the trained network as a MIP, but I find that the results are not as good as the Q network's. I would like to know how you tuned the parameters.

Can I talk to you? I also found that it is not possible to reproduce the results reported in the author's paper.

How can I talk to you? Can you give me your email?

Mrdawnzero avatar Aug 27 '24 05:08 Mrdawnzero

This class is called after training is finished, during the real-time implementation. After training, you have a well-trained Q network (the critic network) and an actor network. We do not use the actor network; instead, we load the trained Q network into this class. Given a state as input, the method "predict_best_action" in this class then solves a MIP over the Q network to obtain the optimal action.

I want to know how you determine that the Q network is well trained. Additionally, I used the OMLT package to model the trained network as a MIP, but I find that the results are not as good as the Q network's. I would like to know how you tuned the parameters.

Can I talk to you? I also found that it is not possible to reproduce the results reported in the author's paper.

How can I talk to you? Can you give me your email?

My email: [email protected]

QIUQIUSCI avatar Sep 12 '24 13:09 QIUQIUSCI