Rayhane Mama

Results: 32 comments by Rayhane Mama

Hello, if by "reproducing the results" you mean training the best model we achieved, then yes, you can do it in plain Python. You can find the hparams in the...

Checkpoint status update (12/22) and (11/11 datasets): - We uploaded at least one model for each dataset (in either JAX or PyTorch). Missing checkpoints in either library will be added...

Hello @turian, and thanks for showing interest in our work. We have added custom dataset support in our [latest commit](https://github.com/Rayhane-mamah/Efficient-VDVAE/commit/ef9ace94a47589104dbae17d9f7cffacf328989b). Sufficient instructions on how to use it are available in [this...

Hi @Kim-Dongjun! Thank you for your kind words! We also hold great respect for your work on UDM. The main goal of our work is to make Very deep VAEs,...

Hello @prafullasd! Thank you for the interest you show in this work, and thanks for reaching out about this issue! You bring up a very good point. We have had...

@LeoniusChen @LEEYOONHYUNG the problem isn't really about the beta being too small and thus pushing the attention forward too fast. The most likely (verified) problem here is that during the...

Ah, I missed the reverse part. Nice catch. No, I am using a simple reduction factor of 1 and the alignment is working well. I can imagine, though, that the...

Thanks for your quick reply @A-Jacobson. About the teacher forcing: this is actually a nice perspective I hadn't thought about, since I only considered using the "always on" teacher forcing....
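For readers unfamiliar with the distinction: "always on" teacher forcing means the decoder is always fed the ground-truth previous frame during training, whereas a scheduled variant sometimes feeds back the model's own prediction. A minimal sketch of that choice (the function name and placeholder arguments are illustrative, not from any Tacotron codebase):

```python
import random

def choose_decoder_input(ground_truth_frame, predicted_frame, tf_ratio):
    """Pick the decoder's input for the next timestep.

    With probability `tf_ratio`, feed the ground-truth previous frame
    (teacher forcing); otherwise feed the model's own last prediction.
    `tf_ratio=1.0` is the "always on" mode; annealing it toward 0 over
    training is the usual scheduled-sampling variant.
    """
    if random.random() < tf_ratio:
        return ground_truth_frame
    return predicted_frame
```

With `tf_ratio=1.0` this degenerates to always returning the ground-truth frame, which is the mode discussed in the comment above.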

Yeah, I'm supposing that information is implicitly being carried forward, since each context vector is computed using the previous one... I am indeed using zoneout LSTMs (unless they're not working...

Hello again @A-Jacobson, sorry for the late reply. If your attention works, I would definitely switch to yours too; it seems cleaner (and let's face it, fewer layers = faster...