torchtitan
                                
                        Make Transformer tolerate missing layers for PP
Stack from ghstack (oldest at bottom):
- #318
- -> #322
- #321
A few small changes here let the manual PP (pipeline parallelism) frontend 'reconfigure' a whole Transformer model into a single stage's portion, simply by setting unwanted layers to None (for top-level layers) or deleting them from the ModuleDict (for 'layers.*').
These changes don't impact the FQNs of the remaining layers, which is critical for checkpoint load/save compatibility.
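
As a rough illustration of the idea (not the actual torchtitan code), the sketch below uses a hypothetical `TinyTransformer` and a hypothetical `keep_stage` helper. It assumes the blocks live in an `nn.ModuleDict` keyed by the layer index as a string, and that `forward` skips any top-level module that has been set to None. The module names and sizes are made up for the example.

```python
import torch
import torch.nn as nn


class TinyTransformer(nn.Module):
    """Hypothetical stand-in for the real model; names are illustrative only."""

    def __init__(self, vocab_size=32, dim=8, n_layers=4):
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, dim)
        # ModuleDict keyed by the layer index as a string, so deleting an
        # entry does not renumber the entries that remain.
        self.layers = nn.ModuleDict(
            {str(i): nn.Linear(dim, dim) for i in range(n_layers)}
        )
        self.norm = nn.LayerNorm(dim)
        self.output = nn.Linear(dim, vocab_size)

    def forward(self, x):
        # Each top-level module is skipped when it has been set to None.
        h = self.tok_embeddings(x) if self.tok_embeddings is not None else x
        for layer in self.layers.values():
            h = layer(h)
        h = self.norm(h) if self.norm is not None else h
        return self.output(h) if self.output is not None else h


def keep_stage(model, keep_layers, is_first, is_last):
    """Trim a full model in place down to one pipeline stage (hypothetical helper)."""
    if not is_first:
        model.tok_embeddings = None
    if not is_last:
        model.norm = None
        model.output = None
    for name in list(model.layers.keys()):
        if name not in keep_layers:
            del model.layers[name]
    return model


full_keys = set(TinyTransformer().state_dict().keys())
stage = keep_stage(
    TinyTransformer(), keep_layers={"2", "3"}, is_first=False, is_last=True
)
stage_keys = set(stage.state_dict().keys())

# The stage's parameter names are a subset of the full model's names,
# e.g. "layers.2.weight" keeps its original index.
assert stage_keys <= full_keys
```

Because the surviving keys match the full model's keys exactly, a stage built this way can load its slice of a full-model checkpoint without any name remapping. With an index-based container such as `nn.ModuleList`, removing earlier layers would shift the indices of the remaining ones and change their FQNs.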