eellison
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #140838
* #140841

This torch.cat kernel has multiple subblocks which load from the same input. We were incorrectly reusing the mask vars...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* #140057
* __->__ #139945
* #139944

A couple of changes:
- Tries to reuse dtype propagation rules that were already registered in inductor. These...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #137904

Previously the decomposition would upcast inputs to fp32. This led to a slowdown compared to eager, which would run in fp16....
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #140057
* #139945
* #139944

Adds the remaining unimplemented ops as well as an assertion failure if someone adds a new op...