metaseq
Split model parallel transformer layer into encoder / decoder files
Similar to how the refactor was done for the non-model-parallel version, we should split https://github.com/facebookresearch/metaseq/blob/main/metaseq/model_parallel/modules/transformer_layer.py into two files (encoder vs. decoder) before trying to unify the code paths between the model-parallel and non-model-parallel versions (https://github.com/facebookresearch/metaseq/issues/389).
This refactor was held off so that seq parallel could be merged first, though the number of merge conflicts has likely remained the same 😅 (cc @ngoyal2707 to coordinate).