Chao Zhang
Looks like the issue is in this line:
```
tril = torch.zeros(1, 1, 1, device=a.device, dtype=out.dtype)
```
as changing this line to:
```
tril = torch.zeros(1, 1, 1).cuda()
```
appears...
You're probably facing a different but possibly related issue. Can you file a new bug report with the above information?
Updated the algorithm to remove unnecessary gather operations... now only one gather operation is needed, whereas multiple were used before. Besides being more efficient, this also sidesteps a problem...
I'll put up a fix shortly.
Ah I see, the bigger issue is that the `torch.clamp` implementations currently only handle single scalars instead of `torch.tensors` for min and max args (at least, that seems to be...
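To make the distinction in the comment above concrete, here is a minimal sketch of the difference between scalar bounds and per-element bounds. It is plain Python on lists rather than the actual `torch.clamp` implementation under discussion, and the names `clamp_reference` and `_expand` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Optional, Sequence, Union

Number = Union[int, float]
Bound = Optional[Union[Number, Sequence[Number]]]


def _expand(bound: Bound, n: int) -> List[Optional[Number]]:
    """Turn a scalar, a sequence, or None into a list of n per-element bounds."""
    if bound is None:
        return [None] * n
    if isinstance(bound, (int, float)):
        # Scalar case: the same bound applies to every element.
        return [bound] * n
    if len(bound) != n:
        raise ValueError("per-element bound must match the input length")
    # Sequence case: each element gets its own bound.
    return list(bound)


def clamp_reference(values: Sequence[Number], lo: Bound = None, hi: Bound = None) -> List[Number]:
    """Hypothetical reference semantics: apply the lower bound, then the upper bound."""
    n = len(values)
    out = []
    for v, l, h in zip(values, _expand(lo, n), _expand(hi, n)):
        if l is not None:
            v = max(v, l)
        if h is not None:
            v = min(v, h)
        out.append(v)
    return out


# Scalar bounds: one min and one max for the whole input.
scalar_result = clamp_reference([-2, 0.5, 3], lo=0, hi=1)

# Per-element bounds: the case a scalar-only implementation cannot express.
elementwise_result = clamp_reference([-2, 0.5, 3], lo=[-1, 1, 0], hi=[0, 2, 2])
```

An implementation that accepts only scalars covers the first call but has no way to express the second, which is presumably the gap the comment describes.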
I'll have a PR up shortly.
I'll put up a fix shortly.
I'll take a look at this issue if needed, but I think this really only comes up with dynamic shapes. In these situations, the user usually knows a priori that...
I'll put up a fix shortly.