Optimize the cache fetch for forward split, pt. 1
Summary: Rewrite the kernel to take a cache_hit_rate enum as a template argument. We first check whether the cache is empty and pass that result as a template argument. Inside the first kernel, we then determine the cache conflict miss rate and use it as a template parameter when invoking the second kernel, which performs the actual lookup work.
We pass in uvm_cache_stats as a run-time argument here instead of passing the cache miss rate as a compile-time argument because uvm_cache_stats data is only available on the GPU, and invoking a templatized kernel with the cache miss rate as a template argument would require the cache miss information to first be copied back to the host, which is an expensive operation.
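The dispatch pattern described above can be sketched in plain C++ as follows. Note that the enum name, thresholds, and lookup bodies here are illustrative placeholders, not FBGEMM's actual API; in the patch itself the miss-rate decision happens on the device inside the first kernel (via dynamic parallelism or a device-side switch), which is what avoids copying uvm_cache_stats back to the host.

```cpp
#include <cassert>

// Hypothetical enum standing in for the cache_hit_rate template argument
// described in the summary (names are illustrative).
enum class CacheHitRate { kEmpty, kLowMissRate, kHighMissRate };

// The "second kernel": templatized on the miss-rate category so the
// compiler can specialize the lookup path for each case at compile time.
template <CacheHitRate kRate>
int lookup(int idx) {
  if constexpr (kRate == CacheHitRate::kEmpty) {
    return idx;          // placeholder: bypass-the-cache path
  } else if constexpr (kRate == CacheHitRate::kLowMissRate) {
    return idx + 1;      // placeholder: mostly-cached path
  } else {
    return idx + 2;      // placeholder: conflict-heavy path
  }
}

// Dispatch: a runtime measurement (cache emptiness, conflict miss rate)
// selects which template instantiation to invoke. Shown host-side here
// for simplicity; the patch performs this selection on the GPU.
int dispatch(bool cache_empty, float miss_rate, int idx) {
  if (cache_empty) {
    return lookup<CacheHitRate::kEmpty>(idx);
  }
  if (miss_rate < 0.2f) {  // placeholder threshold
    return lookup<CacheHitRate::kLowMissRate>(idx);
  }
  return lookup<CacheHitRate::kHighMissRate>(idx);
}
```

The benefit of this shape is that each specialization of `lookup` is compiled with the branch resolved, so the per-element hot loop carries no runtime check of the miss-rate category.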
Differential Revision: D48937380
This pull request was exported from Phabricator. Differential Revision: D48937380