Enrico Shippole
@ptillet @yuguo68 Thank you for the additional information. I will review #759 and the examples in `test_dequantize.py`. Support for Triton `int8` and `uint8` dtype conversions would be greatly beneficial.
@vwxyzjn Thank you for the response. I will review the paper which you provided. Do you have any advice on improving the stability of policy updates while still maintaining that...
Hi @vwxyzjn, > Hi Enrico, A2C is just a set of hyper-parameters for PPO. How to make the policy updates more stable remains an open question. I think you can...
Hi @danthe3rd, Thank you for the insight. I enjoyed your Twitter thread on the sequence parallel operators. I installed xformers through pip for cu121, which comes with PyTorch 2.2.0....
Hi @Aleksandar1932, I was just looking into Poetry for package management the other day, so this pull request comes at the perfect time. I have previously used Twine and...
Hi @pommedeterresautee, I tried to follow the same code structure as kernl's forward pass to remain consistent when working on my backward kernel implementation above. I reviewed Tri's implementation...
> @conceptofmind thanks for pointing out the need for a contiguous > > for the latter issue, i don't really know off the top of my head, not without spending...
> are you aware that soft moe will not work in LLMs? also have this built https://github.com/lucidrains/st-moe-pytorch Hopefully can provide more information to fulfill that curiosity soon :smile: Can send...
> @conceptofmind yes indeed, i've seen some papers using mixture of experts in text-to-image models with great success > > nice, as long as you are aware! Hi Phil, We...
Do you know whether Ulysses is plug-and-play with the ring attention implementation in this repository? I understand that it works with standard Flash Attention.