CTranslate2
Can't convert OpenNMT-py model with ALiBi or rotary embeddings to CTranslate2
I get the error below when setting max_relative_positions: -1 or max_relative_positions: -2.
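For reference, a minimal training config fragment that hits this path (a sketch; option names follow OpenNMT-py 3.x YAML, where -1 selects rotary embeddings and -2 ALiBi):

position_encoding: false
max_relative_positions: -1  # or -2 for ALiBi

Conversion then fails with: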
Traceback (most recent call last):
File "/opt/conda/bin/onmt_release_model", line 33, in <module>
sys.exit(load_entry_point('OpenNMT-py==3.2.0', 'console_scripts', 'onmt_release_model')())
File "/opt/conda/lib/python3.10/site-packages/OpenNMT_py-3.2.0-py3.10.egg/onmt/bin/release_model.py", line 35, in main
converter.convert(opt.output, force=True, quantization=opt.quantization)
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/converter.py", line 89, in convert
model_spec = self._load()
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 189, in _load
return _get_model_spec_seq2seq(
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 89, in _get_model_spec_seq2seq
set_transformer_spec(model_spec, variables)
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 199, in set_transformer_spec
set_transformer_encoder(spec.encoder, variables)
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 204, in set_transformer_encoder
set_input_layers(spec, variables, "encoder")
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 230, in set_input_layers
set_position_encodings(
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 330, in set_position_encodings
spec.encodings = _get_variable(variables, "%s.pe" % scope).squeeze()
File "/opt/conda/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 334, in _get_variable
return variables[name].numpy()
KeyError: 'encoder.embeddings.make_embedding.pe.pe'
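The failing lookup explains the error: set_position_encodings unconditionally reads the sinusoidal position-encoding table from the checkpoint, and with max_relative_positions: -1 or -2 no such table exists. A quick way to confirm is to inspect the checkpoint directly (a sketch; the file name is hypothetical, and it assumes a standard OpenNMT-py *.pt checkpoint storing the weights under the "model" key):

import torch

# Load the OpenNMT-py checkpoint on CPU (newer torch versions may
# need weights_only=False since the checkpoint pickles option objects).
ckpt = torch.load("model_step_10000.pt", map_location="cpu")
state = ckpt["model"]  # assumption: weights stored under "model"

# The converter expects this sinusoidal PE table; for rotary/ALiBi
# models it was never registered, hence the KeyError above.
print("encoder.embeddings.make_embedding.pe.pe" in state)

# List whatever position-related tensors actually exist.
for name in state:
    if ".pe" in name:
        print(name)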
Which version or codebase of OpenNMT-py did you use? It seems that you have both position_encoding=True and max_relative_positions != 0; that combination is now tested for and rejected: https://github.com/OpenNMT/OpenNMT-py/blame/master/onmt/utils/parse.py#L302
The master version at the time of writing, and position_encoding: false.
With max_relative_positions = 20 the conversion goes OK, but with -1 or -2 it fails.
I see that you are trying to convert an encoder-decoder model (_get_model_spec_seq2seq is in the stack trace), but the converter currently does not handle max_relative_positions: -1 or max_relative_positions: -2 for these models.
Yes, OK. I just wanted to test these new options for NMT tasks. Anyway, the old options work well.
You can still assess your model with regular OpenNMT-py inference; I am also interested in such results. We'll add those options to the encoder/decoder configuration if it makes sense.
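For example, with the standard OpenNMT-py CLI (hypothetical file names):

onmt_translate -model model_step_10000.pt -src test.src -output pred.txt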
So far, I have tested it with the following options: add_ffnbias: false, multiquery: true, add_qkvbias: false. I also added more layers to ensure the model has the same number of parameters or more. However, it performed worse than the standard options.
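For reference, the corresponding config fragment (a sketch assembled from the options quoted above; OpenNMT-py 3.x YAML):

# Bias-free FFN and QKV projections, with multi-query attention.
add_ffnbias: false
multiquery: true
add_qkvbias: false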