ppo_libtorch: where is the formula in the C++ file?

Open fatalfeel opened this issue 5 years ago • 3 comments

https://github.com/Mikoto10032/DeepLearning/blob/master/books/%5B%E6%B7%B1%E5%BA%A6%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0%5D%5BHung-yi%20Lee%5D/PPO%20(v3).pdf

On page 9 of this PDF, the formula is given as $p_\theta(\tau) = p(s_1)\, p_\theta(a_1 \mid s_1)\, p(s_2 \mid s_1, a_1)\, p_\theta(a_2 \mid s_2)\, p(s_3 \mid s_2, a_2) \cdots$

Where is this formula in the C++ file? Which function implements it, or where is it defined? Please help me find it.

Feb 17 '20 22:02 fatalfeel

In a Bayesian network, the code actually computes the conditional probability (see http://dlib.net/bayes_net_ex.cpp.html).

The PPO algorithm has this formula, e.g. $p_\theta(a_t \mid s_t)$: https://github.com/Mikoto10032/DeepLearning/blob/master/books/%5B%E6%B7%B1%E5%BA%A6%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0%5D%5BHung-yi%20Lee%5D/PPO%20(v3).pdf

I cannot connect $p_\theta(a_t \mid s_t)$ to the source code. Or do the many sums of the form Y = W x Input + B represent this probability? I am confused about how the formula relates to the source code. Please help me resolve this.

Feb 18 '20 22:02 fatalfeel

You won't find this exact formula, only the probability of taking an action here. The log-probability is computed from how far the action is from the current distribution. I think the formula in this PDF merely shows the property of a Markov chain, which is that each action is independent of the previous states and depends only on the current state. Hope this helps.
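
To make the "how far off the action is" point concrete, here is a minimal sketch of a log-probability under a Gaussian policy. It assumes a scalar action and a normal distribution whose mean `mu` would come from the network output for the current state. The function and parameter names are hypothetical and not taken from the repository.

```cpp
#include <cassert>
#include <cmath>

// Log-density of a scalar action under a Gaussian policy N(mu, sigma^2).
// Assumption: mu stands for the policy network output for the current state
// s_t, so the returned value plays the role of log p_theta(a_t | s_t).
// Names are illustrative only, not the repository's.
double gaussian_log_prob(double action, double mu, double sigma) {
    assert(sigma > 0.0);  // the standard deviation must be positive
    const double kPi = 3.14159265358979323846;
    const double z = (action - mu) / sigma;  // distance from the mean, in std units
    return -0.5 * z * z - std::log(sigma) - 0.5 * std::log(2.0 * kPi);
}
```

The further `action` lies from `mu`, the larger `z * z` becomes and the lower the log-probability, so `gaussian_log_prob(2.5, 0.5, 1.0)` is smaller than `gaussian_log_prob(0.5, 0.5, 1.0)`.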

Mar 19 '20 21:03 mhubii
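
A short derivation may help explain why only the action probability shows up in code. Taking the log of the trajectory probability from the PDF turns the product into a sum, and the environment terms $p(s_1)$ and $p(s_{t+1} \mid s_t, a_t)$ do not depend on $\theta$, so they drop out of the gradient. The trajectory length $T$ is notation assumed here for illustration.

```latex
\begin{align}
\log p_\theta(\tau)
  &= \log p(s_1)
   + \sum_{t=1}^{T} \log p_\theta(a_t \mid s_t)
   + \sum_{t=1}^{T-1} \log p(s_{t+1} \mid s_t, a_t) \\
\nabla_\theta \log p_\theta(\tau)
  &= \sum_{t=1}^{T} \nabla_\theta \log p_\theta(a_t \mid s_t)
\end{align}
```

Only the $\log p_\theta(a_t \mid s_t)$ terms remain, which is consistent with an implementation that computes nothing but the log-probability of the chosen action.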

mhubii, thanks. So even in the PyTorch layers I still cannot find the formula, e.g. $p_\theta(a_t \mid s_t)$ or $p_{\theta'}(a_t \mid s_t)$. The PPO formula in PyTorch is just a kind of conditional probability. Am I right?

Mar 20 '20 20:03 fatalfeel
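
For reference, a rough sketch of how $p_\theta(a_t \mid s_t)$ and $p_{\theta'}(a_t \mid s_t)$ typically enter PPO code: the two are usually kept as log-probabilities, and their ratio feeds the standard clipped surrogate objective. The function names and the `epsilon` parameter below are hypothetical and are not taken from ppo_libtorch or PyTorch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Ratio p_theta(a_t|s_t) / p_theta'(a_t|s_t), computed from log-probabilities.
// Working in log space avoids dividing two very small numbers.
double prob_ratio(double log_prob_new, double log_prob_old) {
    return std::exp(log_prob_new - log_prob_old);
}

// Standard PPO clipped surrogate for a single sample:
//   min(r * A, clip(r, 1 - eps, 1 + eps) * A)
// Illustrative names; in training code this value is averaged over a batch
// and maximized.
double clipped_surrogate(double ratio, double advantage, double epsilon) {
    assert(epsilon >= 0.0);
    const double clipped =
        std::max(1.0 - epsilon, std::min(ratio, 1.0 + epsilon));
    return std::min(ratio * advantage, clipped * advantage);
}
```

In this sketch both probabilities are conditional probabilities of the same action given the same state, evaluated under the new and the old policy parameters.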
