ppo_libtorch
Where is the formula in the C++ file?
https://github.com/Mikoto10032/DeepLearning/blob/master/books/%5B%E6%B7%B1%E5%BA%A6%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0%5D%5BHung-yi%20Lee%5D/PPO%20(v3).pdf
On page 9 of this PDF, the formula is: $p_\theta(\tau) = p(s_1)\, p_\theta(a_1 \mid s_1)\, p(s_2 \mid s_1, a_1)\, p_\theta(a_2 \mid s_2)\, p(s_3 \mid s_2, a_2) \cdots$
Where is this formula in the C++ files? Which function implements it, or where is it defined? Please help me find it.
In a Bayesian network, the conditional probability is actually computed explicitly (see http://dlib.net/bayes_net_ex.cpp.html).
The PPO algorithm has a similar term, e.g. $p_\theta(a_t \mid s_t)$: https://github.com/Mikoto10032/DeepLearning/blob/master/books/%5B%E6%B7%B1%E5%BA%A6%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0%5D%5BHung-yi%20Lee%5D/PPO%20(v3).pdf
I cannot connect $p_\theta(a_t \mid s_t)$ to the source code. Or do the many summations of the form Y = W x Input + B represent this probability? I am confused about how the formula relates to the source code. Please help me sort this out.
You won't find this exact formula, only the probability of taking an action here. The log-probability is computed from how far the action is from the current distribution. I think the formula in this PDF merely shows the Markov property: each action is independent of the previous states and depends only on the current state. Hope this helps.
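To make the link between $p_\theta(a_t \mid s_t)$ and code more concrete, here is a minimal sketch in plain C++ (no libtorch). It assumes a one-dimensional Gaussian policy whose mean and standard deviation are produced by the network for the current state, which is a common choice for continuous actions but is not confirmed to be exactly what this repository does. The function names `gaussian_log_prob` and `probability_ratio` are hypothetical and chosen only for illustration.

```cpp
#include <cmath>

// Hypothetical helper: log p_theta(a | s) for a 1-D Gaussian policy.
// `mean` and `stddev` stand for what the policy network outputs for state s
// (i.e. the result of the Y = W x Input + B layers), `action` is a.
double gaussian_log_prob(double action, double mean, double stddev) {
    const double kPi = 3.14159265358979323846;
    // Standardized distance of the action from the distribution's mean.
    const double z = (action - mean) / stddev;
    // log N(a; mean, stddev^2) = -z^2/2 - log(stddev) - log(2*pi)/2
    return -0.5 * z * z - std::log(stddev) - 0.5 * std::log(2.0 * kPi);
}

// Hypothetical helper: the ratio p_theta(a|s) / p_theta'(a|s),
// computed from the two log-probabilities for numerical stability.
double probability_ratio(double new_log_prob, double old_log_prob) {
    return std::exp(new_log_prob - old_log_prob);
}
```

Under these assumptions, the linear layers do not store the probability themselves. They only produce the parameters of the distribution, and the conditional probability appears when an action is evaluated against that distribution. The environment terms such as $p(s_{t+1} \mid s_t, a_t)$ never need to be written in code, because they do not depend on $\theta$.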
mhubii, thanks. So even in the PyTorch layers I still cannot find the formula, e.g. $p_\theta(a_t \mid s_t)$ or $p_{\theta'}(a_t \mid s_t)$. Is the PPO formula in PyTorch just a kind of conditional probability? Am I right?