VIOLA
Policy Training
Does the architecture train a separate policy for each task? If not, how do we specify the task at test time?