int4-TBE auto-vec optimization by increasing prefetching
Summary: Increase prefetching and reduce backend stalls, as suggested by NVIDIA.
Differential Revision: D53552699
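For readers unfamiliar with the technique, here is a minimal sketch of software prefetching in an int4 pooled embedding lookup. It is not the code from this diff. The function name `pooled_sum_int4`, the row layout (two 4-bit values per byte, low nibble first, with per-row scale and bias in separate arrays), and the `prefetch_distance` parameter are all assumptions made for illustration. `__builtin_prefetch` is the GCC/Clang builtin.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout (an assumption, not FBGEMM's actual row format):
// each row holds `dim` 4-bit values packed two per byte, low nibble first,
// with per-row scale and bias stored in separate float arrays.
std::vector<float> pooled_sum_int4(
    const std::uint8_t* table,
    const float* scale,
    const float* bias,
    const std::int64_t* indices,
    std::size_t num_indices,
    std::size_t dim,
    std::size_t prefetch_distance) {
  const std::size_t row_bytes = dim / 2;
  std::vector<float> out(dim, 0.0f);

  for (std::size_t i = 0; i < num_indices; ++i) {
    // Request the row needed `prefetch_distance` iterations from now, so the
    // memory load overlaps with dequantizing the current row.
    if (i + prefetch_distance < num_indices) {
      const std::size_t next_idx =
          static_cast<std::size_t>(indices[i + prefetch_distance]);
      const std::uint8_t* next_row = table + next_idx * row_bytes;
      // Touch one address per 64-byte cache line of the upcoming row.
      for (std::size_t off = 0; off < row_bytes; off += 64) {
        __builtin_prefetch(next_row + off, /*rw=*/0, /*locality=*/0);
      }
    }

    const std::size_t idx = static_cast<std::size_t>(indices[i]);
    const std::uint8_t* row = table + idx * row_bytes;
    const float s = scale[idx];
    const float b = bias[idx];

    // Plain scalar loop with no cross-iteration dependence; the compiler
    // may auto-vectorize it.
    for (std::size_t j = 0; j < row_bytes; ++j) {
      out[2 * j] += s * static_cast<float>(row[j] & 0x0F) + b;
      out[2 * j + 1] += s * static_cast<float>(row[j] >> 4) + b;
    }
  }
  return out;
}
```

A larger `prefetch_distance` gives each load more time to complete, but if it is too large the prefetched lines may be evicted before use. A suitable value would have to be found by benchmarking on the target hardware.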
Deploy Preview for pytorch-fbgemm-docs ready!
| Name | Link |
|---|---|
| Latest commit | bab6fd5d850fc1bb43ba1102caafc7c2c48bff23 |
| Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/65d6532624e3230008dd91a8 |
| Deploy Preview | https://deploy-preview-2325--pytorch-fbgemm-docs.netlify.app |
This pull request was exported from Phabricator. Differential Revision: D53552699
This pull request has been merged in pytorch/FBGEMM@f6195314346935c7b10909a1d5e70a97357ffd3c.