Nikos Karampatziakis comments

Results: 14 comments by Nikos Karampatziakis

Thank you! We are looking to compare with offline REM and offline QR DQN on 1%, 10%, 20% and 100% of data for the following games: breakout, seaquest, pong, asterix,...

Sorry, I don't know if `StaticCache` can be supported in a similar way. I should also mention a couple of things. The operations `from_legacy_cache`, `to_legacy_cache`, and `reorder_cache` are not really tested...

IndexError: map::at when using tl.dot

Can anyone confirm whether they can repro this?

IndexError: map::at when using tl.dot

Thanks @BobMcDear for trying all these different setups. I can also verify on my setup that `tl.dot(H, X, allow_tf32=False)` resolves the problem, and I am looking forward to a better...

IndexError: map::at when using tl.dot

It is common for people to develop and debug on a lower-end GPU without support for TensorFloat32, then deploy the same code on a GPU that supports it. I...
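For context, TensorFloat32 keeps 10 explicit mantissa bits where float32 keeps 23, so the same dot product can give slightly different values depending on whether TF32 is in use. The following is a minimal CPU-only sketch of that effect, not Triton code: the helper names `to_tf32_like` and `dot` are hypothetical, and the sketch simply truncates the low 13 mantissa bits, whereas real hardware rounding may differ.

```python
import struct


def to_tf32_like(x: float) -> float:
    """Emulate TF32 input precision by keeping only the top 10 mantissa bits
    of the float32 encoding of x (truncation is an assumption of this sketch)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 of the 23 float32 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(a, b, reduce_precision=False):
    """Plain dot product; optionally reduce the inputs to TF32-like precision first."""
    if reduce_precision:
        a = [to_tf32_like(v) for v in a]
        b = [to_tf32_like(v) for v in b]
    return sum(x * y for x, y in zip(a, b))


a = [1.0001, 2.0002, 3.0003]
b = [0.3333, 0.6667, 0.9999]

full = dot(a, b)
reduced = dot(a, b, reduce_precision=True)
print(full, reduced, abs(full - reduced))
```

The gap between the two results is small for a single short dot product, but it is the kind of difference that can appear only after moving code to a TF32-capable GPU.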

KV cache with CPU offloading

I'm going to test `reorder_cache` today-ish and fix any issues. I can make a PR afterwards but I need some guidance on how this should be integrated. New class? Option...

KV cache with CPU offloading

As an update, I am a bit blocked because beam search has become extremely slow with this approach. Profiling suggests that `torch.index_select` operations on the CPU are very slow.
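To make the bottleneck concrete, here is a rough pure-Python sketch of what reordering a KV cache during beam search does conceptually: at each step, every layer's keys and values are re-gathered along the beam dimension according to the surviving beam indices. Nested lists stand in for tensors, and the function name `reorder_cache_sketch` and the data layout are assumptions for illustration, not the library's implementation. With real tensors this gather is what `torch.index_select` along the batch dimension would perform.

```python
def reorder_cache_sketch(cache, beam_idx):
    """Return a new cache whose beam dimension follows beam_idx.

    cache: list of layers, each a (keys, values) pair of lists indexed by beam.
    beam_idx: for each new beam slot, the index of the old beam it continues.
    """
    return [
        ([keys[i] for i in beam_idx], [values[i] for i in beam_idx])
        for keys, values in cache
    ]


# One layer, three beams; placeholder strings stand in for per-beam tensors.
cache = [(["k0", "k1", "k2"], ["v0", "v1", "v2"])]

# Beams 0 and 1 now both continue old beam 2; beam 2 continues old beam 0.
reordered = reorder_cache_sketch(cache, [2, 2, 0])
print(reordered)
```

Because the gather touches every layer's full key and value history at every decoding step, its cost grows with sequence length, which is consistent with the profiling observation above.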

KV cache with CPU offloading

> would transferring the tensor on device when this happens (in the cache class) not be efficient enough?

Nah, too much back and forth. What seems to be working is...

KV cache with CPU offloading

Will do. Besides this class, are there any other changes I need to have in the PR?