Edan Toledo comments

Results: 81 comments of Edan Toledo

PPO Envpool doesn't account for episode change

Hey, I thought I'd also chime in here. I realised this difference and simply made a wrapper to achieve the same auto-reset style as the Gym API. My wrapper is...

[IDEA] Easiest way to implement Hindsight Relabeling?

I haven't read the hindsight relabeling paper, so there might be context I am missing, but this sounds achievable with just the trajectory buffer and no extra functionality. Correct me...

[MAINTAIN] JAX 0.6.0 not supported

Aah, I had a feeling this would happen soon. I'll try to do this ASAP. But ideally we fix it so we can use the latest version. Is the error with...

Index trajectories at particular index from buffer

You would have to specify the batch index as well as the starting time index, but this is pretty easy functionality to add.
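As a rough illustration of that kind of indexing, here is a minimal JAX sketch. It assumes the buffer stores experience as a pytree whose leaves all have shape `[batch, time, ...]`. The function name `slice_trajectory` and the toy buffer layout are hypothetical and not taken from any library's API.

```python
import jax
import jax.numpy as jnp


def slice_trajectory(experience, batch_idx, start_t, length):
    """Return `length` consecutive steps from row `batch_idx`, starting at `start_t`.

    Assumes every leaf of `experience` has shape [batch, time, ...].
    """

    def _slice_leaf(leaf):
        row = leaf[batch_idx]  # pick one batch row -> [time, ...]
        # Slice `length` steps along the time axis; `length` must be a static int.
        return jax.lax.dynamic_slice_in_dim(row, start_t, length, axis=0)

    return jax.tree_util.tree_map(_slice_leaf, experience)


# Toy buffer (hypothetical layout): 2 rows of 5 timesteps each.
experience = {
    "obs": jnp.arange(2 * 5 * 3).reshape(2, 5, 3),
    "reward": jnp.arange(2 * 5, dtype=jnp.float32).reshape(2, 5),
}

# Take 3 steps from batch row 1, starting at time index 2.
traj = slice_trajectory(experience, batch_idx=1, start_t=2, length=3)
```

Note that `jax.lax.dynamic_slice_in_dim` clamps the start index so the slice stays in bounds. A circular buffer whose trajectories wrap around the end of the time axis would need modular indexing instead, which this sketch does not handle.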

feat: switch sebublba to using shard_map like mava

Thanks so much, I'll try to review this and test it tomorrow on a GPU.

feat: switch sebublba to using shard_map like mava

I just did a comparison, and it seems like sebulba on main is faster. Looking at all the timing statistics, it's the pipeline that is slowing things down. Everything else...

feat: switch sebublba to using shard_map like mava

> I just did a comparison, and it seems like sebulba on main is faster. Looking at all the timing statistics, it's the pipeline that is slowing things down. Everything...

feat: switch sebublba to using shard_map like mava

So I ended up completely refactoring the sebulba architectures, taking a mix of inspiration from cleanba and InstaDeep's one. When timing the previous systems versus cleanba, cleanba was much faster...

This line prevents the use of jax.distributed.initialize when importing flashbax

I think it's due to the import; however, it's possible this issue is not correct. I'll try to make a dummy script to check at some point.

Paper results reproduction scripts

Thanks for the response! So I'm having trouble getting any performance above zero (let alone matching the performance of the paper) for the door key 8x8 environment. Using these reproduction...