metaseq
convert_to_singleton doesn't seem to handle bias properly
- Take a 125m pretrained checkpoint.
- Consolidate the checkpoint using convert_to_singleton.py.
- Try loading the model behind the metaseq API. Loading fails with the following error:
RuntimeError: Error(s) in loading state_dict for TransformerLanguageModel: Missing key(s) in state_dict: "decoder.layers.0.fc2.bias", "decoder.layers.1.fc2.bias",
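To narrow down which parameters the consolidated checkpoint dropped, it may help to diff the key names the model expects against the key names actually stored in the checkpoint. The following is a minimal sketch, not part of metaseq: the helper names `diff_state_dict_keys` and `summarize_by_suffix` are hypothetical, and the key lists are illustrative stand-ins built from the two keys named in the error above. In practice the expected keys would come from `model.state_dict().keys()` and the checkpoint keys from the loaded checkpoint's state dict.

```python
from collections import Counter


def diff_state_dict_keys(expected_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable output."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    missing = sorted(expected - found)      # model wants these, checkpoint lacks them
    unexpected = sorted(found - expected)   # checkpoint has these, model does not
    return missing, unexpected


def summarize_by_suffix(keys):
    """Count keys by their last two dotted components, e.g. 'fc2.bias'."""
    return Counter(".".join(k.split(".")[-2:]) for k in keys)


# Illustrative data only (assumption): a real run would use the model's
# state_dict keys and the keys found in the consolidated checkpoint.
expected = [
    "decoder.layers.0.fc2.weight", "decoder.layers.0.fc2.bias",
    "decoder.layers.1.fc2.weight", "decoder.layers.1.fc2.bias",
]
checkpoint = ["decoder.layers.0.fc2.weight", "decoder.layers.1.fc2.weight"]

missing, unexpected = diff_state_dict_keys(expected, checkpoint)
print(missing)
print(summarize_by_suffix(missing))
```

If every missing key shares the same suffix (here `fc2.bias`), that points at one parameter type being skipped during consolidation rather than at a corrupted checkpoint.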