rlpyt
Why does DQN move the predicted Q-values and target Q-values to the CPU?
See https://github.com/astooke/rlpyt/blob/f04f23db1eb7b5915d88401fca67869968a07a37/rlpyt/agents/dqn/dqn_agent.py#L29. The predicted Q-values and target Q-values are computed on the GPU and then moved to the CPU, so the DQN loss ends up being calculated on the CPU. I'm confused: why not do the whole computation on the GPU?
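To make the question concrete, here is a minimal PyTorch sketch of the two patterns I am comparing. It is not rlpyt's actual code: `q_net`, `target_net`, `td_loss`, and the tensor shapes are hypothetical stand-ins, and the loss is a plain one-step TD mean-squared error. The script falls back to the CPU when no GPU is available.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the online and target Q-networks (not rlpyt models).
q_net = nn.Linear(4, 2).to(device)
target_net = nn.Linear(4, 2).to(device)
target_net.load_state_dict(q_net.state_dict())

# A fake replay batch that lives on the CPU.
obs = torch.randn(8, 4)
next_obs = torch.randn(8, 4)
action = torch.randint(0, 2, (8,))
reward = torch.randn(8)
gamma = 0.99


def td_loss(q, target_q, action, reward):
    """One-step TD loss; all arguments must be on the same device."""
    q_sa = q.gather(1, action.unsqueeze(1)).squeeze(1)
    y = reward + gamma * target_q.max(dim=1).values
    return nn.functional.mse_loss(q_sa, y)


# Pattern A (what I see): forward pass on the device, outputs moved to the CPU,
# loss computed on the CPU. Gradients still flow back through .cpu().
q_cpu = q_net(obs.to(device)).cpu()
with torch.no_grad():
    target_q_cpu = target_net(next_obs.to(device)).cpu()
loss_cpu = td_loss(q_cpu, target_q_cpu, action, reward)

# Pattern B (what I expected): keep everything on the device.
q_dev = q_net(obs.to(device))
with torch.no_grad():
    target_q_dev = target_net(next_obs.to(device))
loss_dev = td_loss(q_dev, target_q_dev, action.to(device), reward.to(device))
```

Both patterns should give the same loss value up to floating-point error; the difference is only where the loss arithmetic runs and which tensors have to cross the device boundary (the network outputs in pattern A, the actions and rewards in pattern B).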