richcmwang
Hi @afiaka87 The sparse attn page suggests setting ('full', 'sparse') to cycle between full and sparse attention: ```dalle = DALLE( # ... attn_types = ('full', 'sparse') # cycles between full...
Hi Phil @lucidrains, I notice a KL divergence term (set to 0 by default) in the `DiscreteVAE`. The paper often cited (Neural Discrete Representation Learning) has two extra stop-gradient terms. Can...
How can I make it work with JupyterLab 3?
Hi, thanks for the very nice work. Atom-Dark is the best-looking theme. I was wondering whether we can increase the font size. Also, can it be installed as an...
Hi, thank you for sharing the nice work. I notice that the result of SupContrast with the MoCo trick on ImageNet is 79.1. Do you plan to push the code here?...