chowarfb

Results: 1 issue from chowarfb

Summary: The `lru_cache_insert` kernel is not sensitive to the number of SMs (it is latency bound). Using fewer SMs avoids a structural hazard on the main training stream. https://docs.google.com/document/d/1p3Id8HfVMfyFn4ZcL4e79Rl0ktTSevnW3jXm9PTy0ys/edit#bookmark=id.lyjw9rtmebv0 Given that the performance-optimized config uses pipelining,...

fb-exported
cla signed