torchrec

Filter out batch_fused from available kernels unless fused_params is explicitly set and contains an optimizer

Open YLGH opened this issue 3 years ago • 1 comment

Summary: ATT

One common mishap with the current optimizer fusion is that the planner may select batch_fused even when fused_params is empty, in which case the optimizer defaults to SGD. This is dangerous because it changes the model's behavior without the author knowing.

Ideally, fused params would not be passed in at the sharder level at all but inferred from the module itself. Still, this is a step in the right direction (I hope).
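
For illustration, the filtering rule described above could be sketched roughly as follows. This is not the code from the diff: the function name `filter_compute_kernels`, the plain-string kernel names, and the `"optimizer"` dictionary key are all assumptions made for this example, and the real planner works with its own kernel types.

```python
from typing import Any, Dict, List, Optional

# Kernel name as written in this PR; a plain string here, purely illustrative.
BATCH_FUSED = "batch_fused"


def filter_compute_kernels(
    available_kernels: List[str],
    fused_params: Optional[Dict[str, Any]],
) -> List[str]:
    """Hypothetical helper: drop the fused kernel unless fused_params
    was explicitly provided and names an optimizer."""
    # Assumption: the optimizer is stored under the key "optimizer".
    has_explicit_optimizer = bool(fused_params) and "optimizer" in fused_params
    if has_explicit_optimizer:
        return list(available_kernels)
    # Without an explicit optimizer, a fused kernel would silently fall back
    # to a default one, so it is removed from the candidates.
    return [kernel for kernel in available_kernels if kernel != BATCH_FUSED]


kernels = ["dense", "sparse", BATCH_FUSED]
no_params = filter_compute_kernels(kernels, None)
empty_params = filter_compute_kernels(kernels, {})
with_optimizer = filter_compute_kernels(kernels, {"optimizer": "adagrad"})
```

Under these assumptions, `no_params` and `empty_params` exclude `batch_fused`, while `with_optimizer` keeps every kernel available to the planner.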

Differential Revision: D36180601

YLGH, May 05 '22 21:05

This pull request was exported from Phabricator. Differential Revision: D36180601

facebook-github-bot, May 05 '22 21:05