LeonEricsson
Gotchu, it could probably slide into the speculative example; I just want to make sure things remain modular so as not to strain users attempting to understand and reimplement the examples I...
> > I haven't looked through the speculative example thoroughly since the change to T5 but I'll give it a look and try to decide what's most appropriate between 1)...
> > I'm trying to make things work with T5, I don't have the time to rewrite SpeculativeDecoder for LLM right now (think the goal should still be to change...
Implementation moved to #237.
I am able to replicate this. There seems to be a mismatch between the shapes of the `residual_hidden_state` tensors saved during downsampling and the shape that results when `x` is upsampled. They should...
> Thank you for investigating. I did the padding exercise as you suggested to make the image `[544, 544,3]` but I still got the same mismatch of dimensions as reported...
> Another problem, as I discovered trying to do `[1024, 1024,3]` is that larger images than `[576, 576,3]` can easily generate runtime errors because of lack of resources. That should...
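To illustrate the kind of mismatch described above, here is a minimal sketch in plain Python that only tracks spatial sizes. It assumes each downsampling step is a stride-2 convolution that maps a size `n` to `ceil(n / 2)` and each upsampling step doubles the size; the function names and the `levels` parameter are hypothetical and are not taken from the model under discussion, whose actual depth may differ.

```python
def downsample_sizes(size: int, levels: int) -> list[int]:
    """Return the sizes saved as residuals before each halving, plus the bottleneck size."""
    sizes = []
    for _ in range(levels):
        sizes.append(size)        # residual saved at this level
        size = (size + 1) // 2    # assumed stride-2 conv: ceil(size / 2)
    sizes.append(size)            # bottleneck size
    return sizes


def first_mismatch(size: int, levels: int):
    """Return (upsampled, residual) for the first level where they differ, or None."""
    sizes = downsample_sizes(size, levels)
    x = sizes[-1]
    for residual in reversed(sizes[:-1]):
        x *= 2                    # assumed 2x upsampling
        if x != residual:
            return (x, residual)
    return None


def padded_size(size: int, levels: int) -> int:
    """Smallest size >= `size` that is divisible by 2 ** levels."""
    multiple = 2 ** levels
    return -(-size // multiple) * multiple


# Hypothetical depth of 6 halvings, chosen only for illustration.
print(first_mismatch(544, 6))  # (18, 17): an odd intermediate size breaks the concat
print(first_mismatch(576, 6))  # None
print(padded_size(544, 6))     # 576
```

Under these assumptions, padding only helps when the padded size is divisible by `2 ** levels`; a size that is merely even can still hit an odd intermediate size deeper in the network.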
Also facing this issue with nvcr 24.01-py3
> For nvcr 23.12 and 24.01 please use flash-attn 2.5.1.post1

'preciate the immediate response 🙌🏼