bcf
Can't reproduce the results of the paper.
I see something strange in the results when running the code. My navigation results are as follows. The hyper-parameters are the same as yours (Num_Agents = 5).
Besides, I have a question about BCF. I see that you only evaluate the ensemble policy during training. Do we need the prior controller for evaluation, or should we just use the ensemble policy to select actions?
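To make the question concrete, here is a minimal sketch of the two evaluation modes I mean. It is not taken from the repository: all function names are hypothetical, and I am assuming that the fusion step is a precision-weighted product of the ensemble's Gaussian and the prior controller's Gaussian. Please correct me if the actual method differs.

```python
import numpy as np

def ensemble_gaussian(member_actions):
    """Collapse per-member actions (shape [n_members, act_dim]) into mean and variance.

    Hypothetical helper: assumes each ensemble member outputs one action vector.
    """
    actions = np.asarray(member_actions, dtype=float)
    # Small constant keeps the variance strictly positive when members agree exactly.
    return actions.mean(axis=0), actions.var(axis=0) + 1e-6

def fuse_gaussians(mu_a, var_a, mu_b, var_b):
    """Precision-weighted product of two independent diagonal Gaussians."""
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)
    mu = var * (mu_a / var_a + mu_b / var_b)
    return mu, var

def eval_action(member_actions, prior_mu=None, prior_var=None):
    """Return the evaluation action under one of the two modes in question."""
    mu, var = ensemble_gaussian(member_actions)
    if prior_mu is None:
        # Mode A: evaluate with the ensemble policy alone.
        return mu
    # Mode B (assumed): fuse the ensemble with the prior controller, act on the fused mean.
    fused_mu, _ = fuse_gaussians(mu, var, np.asarray(prior_mu, dtype=float),
                                 np.asarray(prior_var, dtype=float))
    return fused_mu
```

With Num_Agents = 5, `member_actions` would have five rows. Calling `eval_action(member_actions)` corresponds to mode A, and passing `prior_mu` and `prior_var` corresponds to mode B. Which of the two was used for the results reported in the paper?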