rust-bert
Failed to run custom model with convert_model.py
Hi, I can't get this to work, and I was wondering whether you could advise me on whether the cause is the custom model itself, the conversion method, or the way I am using it.
The conversion completes without errors, but at runtime I get:

Tch tensor error: cannot find the tensor named model.encoder.embed_positions.weight
I am trying to convert this model. https://huggingface.co/staka/fugumt-en-ja
I have added the model to your code as follows. https://github.com/solaoi/rust-bert/commit/e7d4e1d7351f4363736771b6d3f43521ab25e0ba
usage:

pub fn translate_to_japanese(text: &str) -> anyhow::Result<String> {
    let model_resource = RemoteResource::from_pretrained(MarianModelResources::ENGLISH2JAPANESE);
    let config_resource = RemoteResource::from_pretrained(MarianConfigResources::ENGLISH2JAPANESE);
    let vocab_resource = RemoteResource::from_pretrained(MarianVocabResources::ENGLISH2JAPANESE);
    let merges_resource = RemoteResource::from_pretrained(MarianSpmResources::ENGLISH2JAPANESE);
    let source_languages = MarianSourceLanguages::ENGLISH2JAPANESE;
    let target_languages = MarianTargetLanguages::ENGLISH2JAPANESE;
    let translation_config = TranslationConfig::new(
        ModelType::Marian,
        model_resource,
        config_resource,
        vocab_resource,
        Some(merges_resource),
        source_languages,
        target_languages,
        Device::Cpu,
    );
    let model = TranslationModel::new(translation_config)?;
    let output = model.translate(&[text], None, None)?;
    Ok(output.join(""))
}
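To narrow down whether the tensor gets lost during conversion or only fails at load time, the converted archive can be inspected directly. A minimal sketch (the model.npz path is an assumption, adjust it to your output folder):

import numpy as np

# list every tensor name stored in the converted archive
arrays = np.load("model.npz")  # hypothetical path to the converted weights
print("model.encoder.embed_positions.weight" in arrays.files)
for name in sorted(arrays.files):
    print(name)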
I've found that some models do not show model.encoder.embed_positions.weight in the conversion logs.
The default model (i.e. OPUS-MT-EN-ROMANCE) has the following lines in its conversion logs:

converted model.encoder.embed_positions.weight - 128 bytes
converted model.decoder.embed_positions.weight - 128 bytes

but this model does not produce them either. https://huggingface.co/Helsinki-NLP/opus-tatoeba-en-ja
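A likely reason, if I understand recent transformers versions correctly: static sinusoidal position embeddings are deterministic, so newer versions recompute them at load time instead of storing them in the checkpoint, and a checkpoint saved that way simply has no embed_positions tensors for convert_model.py to pick up. This can be checked against the raw checkpoint (a sketch; the file path is assumed):

import torch

# inspect the raw Hugging Face checkpoint for position-embedding tensors
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
print([k for k in state_dict if "embed_positions" in k])  # empty if they were not saved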
I added the two missing tensors (model.encoder.embed_positions.weight and model.decoder.embed_positions.weight) to the conversion script as shown below, and this solved the loading problem.
However, the translation quality is very poor, so something about the way I generate them must still be wrong.
# add this (requires "import math" at the top of convert_model.py,
# alongside the existing torch / numpy imports)
def sinusoidal_positional_embedding(max_seq_len, d_model):
    position = torch.arange(max_seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
    pos_emb = torch.empty(max_seq_len, 1, d_model)
    pos_emb[:, 0, 0::2] = torch.sin(position * div_term)  # sine on even indices
    pos_emb[:, 0, 1::2] = torch.cos(position * div_term)  # cosine on odd indices
    return pos_emb

if __name__ == "__main__":
    ...
    nps = {}
    target_folder = Path(args.source_file[0]).parent

    # add this
    max_seq_len = 512  # max_position_embeddings in config.json
    d_model = 512      # d_model in config.json
    position_embeddings = sinusoidal_positional_embedding(max_seq_len, d_model)
    nps["model.encoder.embed_positions.weight"] = torch.nn.Parameter(position_embeddings.squeeze(1))
    nps["model.decoder.embed_positions.weight"] = torch.nn.Parameter(position_embeddings.squeeze(1))

    for source_file in args.source_file:
        ...

    # add this: convert the hand-built torch tensors to numpy arrays
    # before saving, since np.savez expects arrays
    nps = {k: v.detach().numpy() if torch.is_tensor(v) else v for k, v in nps.items()}
    np.savez(target_folder / "model.npz", **nps)
    ...
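One possible cause of the poor quality: if I read the transformers source correctly, MarianSinusoidalPositionalEmbedding does not interleave sines and cosines the way the snippet above does; it places all sines in the first half of the embedding dimension and all cosines in the second half. Interleaved embeddings would load without error but feed every position a signal the trained weights have never seen. A hedged sketch of the half-split layout:

import numpy as np
import torch

def marian_positional_embedding(n_pos, dim):
    # angle table: each column pair (2j, 2j+1) shares the frequency 10000^(-2j/dim)
    position_enc = np.array(
        [[pos / np.power(10000, 2 * (j // 2) / dim) for j in range(dim)] for pos in range(n_pos)]
    )
    out = torch.zeros(n_pos, dim)
    sentinel = dim // 2 if dim % 2 == 0 else (dim // 2) + 1
    out[:, 0:sentinel] = torch.FloatTensor(np.sin(position_enc[:, 0::2]))  # sines first
    out[:, sentinel:] = torch.FloatTensor(np.cos(position_enc[:, 1::2]))   # cosines second
    return out

If that is indeed the layout the checkpoint was trained with, substituting this for sinusoidal_positional_embedding (it already returns shape (max_seq_len, d_model), so the squeeze(1) calls can be dropped) might recover the quality.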
I also think num_beams in config.json may not be handled properly. MarianConfig and BertConfig are equivalent, as shown here: https://github.com/guillaume-be/rust-bert/blob/c37eb32857edb4de0b76066c39b5de52ac7db7dd/src/marian/marian_model.rs#L524
and BertConfig has no num_beams field: https://github.com/guillaume-be/rust-bert/blob/c37eb32857edb4de0b76066c39b5de52ac7db7dd/src/bert/bert_model.rs#L141-L158
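As a quick check on that theory, one can list which generation settings the Hugging Face config actually carries (a small sketch, assuming config.json is in the working directory):

import json

with open("config.json") as f:
    cfg = json.load(f)
# generation-related keys that a plain model config struct might not parse
print({k: cfg.get(k) for k in ("num_beams", "max_length", "bad_words_ids")})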
hey @solaoi
I encountered a similar issue before:

Tch tensor error: cannot find the tensor named distilbert.transformer.layer.5.sa_layer_norm.weight

It can be caused by the naming conventions or the internal structure of the model. To resolve it, I had to run the conversion script with a prefix, like:

python utils/convert_model.py --prefix distilbert. /path/to/msmarco-distilbert-base-v3/pytorch_model.bin
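The right value for --prefix can be read off the checkpoint itself, since the tensor names show which leading component needs to be stripped (a minimal sketch, with the path assumed):

import torch

# print a few tensor names from the checkpoint to spot the common prefix
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
for name in list(state_dict)[:10]:
    print(name)
# e.g. "distilbert.transformer.layer.5.sa_layer_norm.weight" -> use --prefix distilbert.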