Bui Chi Trung
This PR implements the [torchvision.ops.sigmoid_focal_loss](https://pytorch.org/vision/main/generated/torchvision.ops.sigmoid_focal_loss.html) operation. There are no constraints here; MIOpen is faster than ROCm in all cases. - [x] Added SigmoidFocalLoss operation with forward and backward kernels. - [x]...
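For reference, the forward computation that `sigmoid_focal_loss` performs can be sketched in plain Python. This is a minimal illustration that assumes flat lists of logits and 0/1 targets and follows the torchvision defaults (`alpha=0.25`, `gamma=2`); it is not the MIOpen kernel from the PR, and the function name below is only illustrative.

```python
import math


def sigmoid_focal_loss_ref(logits, targets, alpha=0.25, gamma=2.0, reduction="none"):
    """Reference sigmoid focal loss over flat Python lists (illustrative only)."""
    losses = []
    for x, t in zip(logits, targets):
        # Numerically stable sigmoid.
        if x >= 0:
            p = 1.0 / (1.0 + math.exp(-x))
        else:
            e = math.exp(x)
            p = e / (1.0 + e)
        # Stable binary cross-entropy with logits.
        ce = max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
        # Probability assigned to the true class.
        p_t = p * t + (1.0 - p) * (1.0 - t)
        loss = ce * (1.0 - p_t) ** gamma
        # A negative alpha disables class weighting, as in torchvision.
        if alpha >= 0:
            loss *= alpha * t + (1.0 - alpha) * (1.0 - t)
        losses.append(loss)
    if reduction == "mean":
        return sum(losses) / len(losses)
    if reduction == "sum":
        return sum(losses)
    return losses
```

Calling `sigmoid_focal_loss_ref([0.0, 0.0], [1.0, 0.0], reduction="sum")` weights the positive example by `alpha` and the negative one by `1 - alpha`.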
- [x] Added Kthvalue operation with forward kernels. - [x] Added driver test and gtest. - [x] Compared with ROCm. ### Compare to ROCm The kernel is only 20% faster...
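As a rough picture of what a kthvalue forward pass returns, the sketch below finds the k-th smallest element of a one-dimensional list together with its index, using 1-based `k` as `torch.kthvalue` does. It is an assumption-laden illustration, not the kernel from the PR: it handles only one dimension, and with tied values the returned index may differ from what a GPU implementation reports.

```python
def kthvalue_ref(values, k):
    """Return (value, index) of the k-th smallest element, with 1-based k."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be in the range [1, len(values)]")
    # Sort the indices by their values; the sort is stable, so ties keep input order.
    order = sorted(range(len(values)), key=values.__getitem__)
    idx = order[k - 1]
    return values[idx], idx
```

For example, `kthvalue_ref([3.0, 1.0, 2.0], 2)` selects the second-smallest element and reports where it sits in the input.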
### Ticket Link to GitHub Issue: #13320 ### Problem description Implement the PyTorch ATen operation `uniform` in ttnn using NPU kernels. ### What's changed Nothing changed. Once this PR is approved,...
**Is your feature request related to a problem? Please describe.** Currently, tt doesn't support running the uniform operation on the NPU; it runs on the CPU instead. **Describe the solution you'd like**...
**Is your feature request related to a problem? Please describe.** In LLK header file comments, this line is often used. ``` * The DST register buffer must be in acquired...