Nikos Karampatziakis
Thank you! We are looking to compare with offline REM and offline QR DQN on 1%, 10%, 20% and 100% of data for the following games: breakout, seaquest, pong, asterix,...
Resolved offline
Sorry, I don't know if `StaticCache` can be supported in a similar way. I should also mention a couple of things. The operations `from_legacy_cache`, `to_legacy_cache`, and `reorder_cache` are not really tested...
Can anyone confirm whether they can repro this?
Thanks @BobMcDear for trying all these different setups. I can also verify on my setup that `tl.dot(H, X, allow_tf32=False)` resolves the problem, and I am looking forward to a better...
It is common for people to develop and debug on a lower-end GPU without support for TensorFloat32, then deploy the same code on a GPU that supports it. I...
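To illustrate why the same code can give different numbers on the two kinds of GPU, here is a rough pure-Python sketch of the precision loss. It assumes TF32 can be approximated by zeroing the low 13 mantissa bits of a float32 (TF32 keeps 10 explicit mantissa bits); real hardware may round rather than truncate, and accumulates in float32 rather than in Python floats. The helpers `to_tf32`, `to_fp32`, and `dot` are hypothetical names for this sketch, not part of Triton or PyTorch.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_tf32(x: float) -> float:
    """Approximate TF32 by zeroing the low 13 mantissa bits of the float32 encoding.

    Assumption: truncation is used for simplicity; hardware rounding may differ.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # keep sign, 8 exponent bits, top 10 mantissa bits
    return struct.unpack("<I".replace("I", "f"), struct.pack("<I", bits))[0]


def dot(a, b, allow_tf32: bool) -> float:
    """Dot product whose inputs are cast to TF32 or float32 before multiplying."""
    cast = to_tf32 if allow_tf32 else to_fp32
    return sum(cast(x) * cast(y) for x, y in zip(a, b))


# Inputs whose extra precision sits just below what TF32 can represent.
a = [1.0 + 2.0 ** -11] * 4
b = [1.0] * 4

full = dot(a, b, allow_tf32=False)
reduced = dot(a, b, allow_tf32=True)
print(full, reduced, full - reduced)
```

In this sketch the small `2**-11` term survives the float32 path but is dropped entirely on the emulated TF32 path, which is the kind of silent discrepancy that only shows up after deployment.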
I'm going to test `reorder_cache` today-ish and fix any issues. I can make a PR afterwards but I need some guidance on how this should be integrated. New class? Option...
As an update, I am a bit blocked because beam search has become extremely slow with this approach. Profiling suggests that `torch.index_select` operations on the CPU are very slow.
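For context, reordering a cache during beam search amounts to a gather along the batch (beam) dimension at every layer, which is why it runs once per layer per generation step. The following is a minimal pure-Python sketch of that operation, not the library's implementation; the function `reorder_cache_sketch` and the nested-list cache layout are assumptions made for illustration. With tensors, the inner gather would correspond to `torch.index_select` along dimension 0.

```python
def reorder_cache_sketch(cache, beam_idx):
    """Gather per-beam entries of every layer according to beam_idx.

    cache: list of layers, each a list holding one entry per beam (hypothetical layout).
    beam_idx: for each new beam, the index of the old beam it continues from.
    """
    return [[layer[i] for i in beam_idx] for layer in cache]


# Two layers, three beams. New beams 0 and 1 both continue old beam 2,
# and new beam 2 continues old beam 0.
cache = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]
beam_idx = [2, 2, 0]

reordered = reorder_cache_sketch(cache, beam_idx)
print(reordered)
```

Because the gather touches every layer on every step, any per-call overhead (such as running it on the CPU) is multiplied by the number of layers and generated tokens.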
> would transferring the tensor on device when this happens (in the cache class) not be efficient enough?

Nah, too much back and forth. What seems to be working is...
Will do. Besides this class, are there any other changes I need to have in the PR?