TensorKart
Modified button pushes to be probabilities
Currently the network returns a value roughly between 0 and 1, which is rounded to either push the button (1) or not (0). Instead, I tried clipping the value to the range 0 to 1 and interpreting it as a probability. np.random.choice uses that probability to pseudorandomly choose whether to press each button during each update, and the resulting 0/1 is then sent as the controller instruction.
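A minimal sketch of the sampling step described above, assuming the button outputs arrive as a plain sequence of floats. The function name `sample_button_presses` is hypothetical and is not taken from the TensorKart code; only `np.clip` and `np.random.choice` are real NumPy calls.

```python
import numpy as np


def sample_button_presses(outputs):
    """Turn raw network outputs into 0/1 button presses by sampling.

    `outputs` is assumed to hold one float per button, roughly in [0, 1].
    (Hypothetical helper for illustration, not the project's actual code.)
    """
    # Clip so that values slightly outside [0, 1] become valid probabilities.
    probs = np.clip(np.asarray(outputs, dtype=float), 0.0, 1.0)

    # For each button, draw 1 (press) with probability p and 0 otherwise.
    return [int(np.random.choice([0, 1], p=[1.0 - p, p])) for p in probs]
```

Compared with rounding, an output of 0.3 now presses the button on roughly 30% of updates instead of never, while outputs at or beyond 0 and 1 still behave deterministically.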
Thanks for your patience as I figure out git, etc. It looks like there is a conflict but it is only regarding a comment.
Hmm, I tried to fix the conflicts but it didn't work 100%. I think you'll have to fix this locally.