
Make Transformer tolerate missing layers for PP

Open wconstab opened this issue 1 year ago • 0 comments

Stack from ghstack (oldest at bottom):

  • #318
  • -> #322
  • #321

A few small changes here let the manual PP frontend 'reconfigure' a whole transformer model down to a single stage's portion, simply by setting undesired layers to None (for top-level layers) or deleting them from the ModuleDict (for 'layers.*').

These changes don't impact the FQNs of the remaining layers, which is critical for checkpoint load/save compatibility.
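
A minimal sketch of the idea, for illustration only. `TinyTransformer` and `keep_stage` are hypothetical names, not torchtitan's actual model or API, and the layer types are stand-ins. The sketch assumes the top-level modules are guarded with `is not None` checks in `forward` and that `layers` is an `nn.ModuleDict` keyed by string index, so removing entries does not renumber the ones that remain:

```python
import torch
import torch.nn as nn


class TinyTransformer(nn.Module):
    """Toy stand-in for a Transformer (hypothetical, not torchtitan's model)."""

    def __init__(self, vocab_size=32, dim=8, n_layers=4):
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, dim)
        # Keyed by string index: deleting "0" leaves "2" named "layers.2".
        self.layers = nn.ModuleDict(
            {str(i): nn.Linear(dim, dim) for i in range(n_layers)}
        )
        self.norm = nn.LayerNorm(dim)
        self.output = nn.Linear(dim, vocab_size)

    def forward(self, x):
        # Each top-level module is optional; a missing one is a pass-through.
        h = self.tok_embeddings(x) if self.tok_embeddings is not None else x
        for layer in self.layers.values():
            h = layer(h)
        h = self.norm(h) if self.norm is not None else h
        return self.output(h) if self.output is not None else h


def keep_stage(model, keep_layers, is_first, is_last):
    """Hypothetical helper: trim a full model down to one pipeline stage."""
    for name in list(model.layers.keys()):
        if name not in keep_layers:
            del model.layers[name]
    if not is_first:
        model.tok_embeddings = None
    if not is_last:
        model.norm = None
        model.output = None
    return model


full_keys = set(TinyTransformer().state_dict().keys())
stage0 = keep_stage(TinyTransformer(), {"0", "1"}, is_first=True, is_last=False)
stage1 = keep_stage(TinyTransformer(), {"2", "3"}, is_first=False, is_last=True)

tokens = torch.randint(0, 32, (2, 5))
hidden = stage0(tokens)   # activations handed to the next stage
logits = stage1(hidden)   # final stage applies norm and output projection
```

Because each stage's `state_dict()` keys are a subset of the full model's keys (for example `layers.2.weight` keeps its name on the second stage), a checkpoint written from the full model can be matched against a stage by name without any remapping.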

wconstab, May 10 '24 23:05