mean_resizing = True does not work with mixed/meta initialization
What does this PR do?
Transformers recently added a mean_resizing argument to resize_token_embeddings. With its default of True, it breaks mixed initialization in downstream training tasks that require adding tokens to Composer Hugging Face models. This PR sets the value to False for now instead of relying on the default of True.
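As a rough illustration of the change, here is a minimal sketch of a call site that turns mean resizing off. The helper name `resize_embeddings_without_mean_init` is hypothetical and is not Composer's actual code; it assumes only that the model exposes `resize_token_embeddings`, and it passes `mean_resizing=False` only when the installed Transformers version accepts that keyword.

```python
import inspect


def resize_embeddings_without_mean_init(model, new_num_tokens):
    """Resize token embeddings with mean_resizing disabled when supported.

    Hypothetical helper for illustration; `model` is assumed to expose
    `resize_token_embeddings` as Hugging Face models do.
    """
    resize = model.resize_token_embeddings
    # Older Transformers releases do not have the keyword, so check first.
    params = inspect.signature(resize).parameters
    if "mean_resizing" in params:
        return resize(new_num_tokens, mean_resizing=False)
    return resize(new_num_tokens)
```

Checking the signature keeps the call compatible with Transformers versions that predate the argument; if only newer versions need to be supported, passing `mean_resizing=False` directly would be simpler.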