ELMoForManyLangs

ELMo weights.hdf5 and options.hdf5 files?

Open veronica320 opened this issue 7 years ago • 4 comments

Thanks for this work! Could you please make the weights and options files available (in .hdf5 format), the way the AllenNLP pre-trained models are distributed?

veronica320 avatar Aug 16 '18 20:08 veronica320

+1, that will be perfect for many developers...

tnlin avatar Sep 21 '18 01:09 tnlin

Sorry for the late reply.

To my understanding, our release is not directly portable to AllenNLP because we support Unicode characters, which leads to a difference in model architecture. We have a char_emb layer of variable size that converts Unicode characters to embeddings, while they use a fixed-size char embedding layer.

Unfortunately, we don't have a good way to make our release work with AllenNLP right now. I will leave this issue open in case a solution turns up. Any solutions or suggestions are welcome.
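To make the mismatch described above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not code from either project: the function names, the `<pad>`/`<oov>` symbols, and the 256-row table are all hypothetical. It assumes the fixed-size scheme indexes UTF-8 bytes, and that the variable-size scheme assigns one id per distinct Unicode character seen in the training data.

```python
# Illustrative sketch only: all names and sizes below are assumptions,
# not the API of AllenNLP or ELMoForManyLangs.

def byte_char_ids(word):
    """Fixed-size scheme: a word becomes its UTF-8 bytes, so ids are always 0..255."""
    return list(word.encode("utf-8"))

def build_char_lexicon(corpus):
    """Variable-size scheme: one id per distinct Unicode character in the corpus."""
    lexicon = {"<pad>": 0, "<oov>": 1}  # hypothetical special symbols
    for sentence in corpus:
        for ch in sentence:
            if ch not in lexicon:
                lexicon[ch] = len(lexicon)
    return lexicon

def unicode_char_ids(word, lexicon):
    """Look up each character, falling back to the out-of-vocabulary id."""
    return [lexicon.get(ch, lexicon["<oov>"]) for ch in word]

# The byte-based table has the same number of rows for every language
# (special symbols omitted here for simplicity).
FIXED_TABLE_ROWS = 256

# The lexicon-based table has as many rows as the language needs.
finnish = build_char_lexicon(["hyvää päivää"])
chinese = build_char_lexicon(["你好世界"])

print(len(finnish), len(chinese))      # table sizes differ per language
print(byte_char_ids("ä"))              # one character, two byte ids
print(unicode_char_ids("ä", finnish))  # one character, one id
```

Under these assumptions, the embedding matrix has a different number of rows for each language, so the weights cannot simply be loaded into a layer whose shape is fixed in advance.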

Oneplus avatar Oct 07 '18 16:10 Oneplus

For people looking for a quick and dirty way to embed one sentence at a time (which is slower than using batches), feel free to reuse my partly copy-pasted, hacked-up code. See embed_sentence in https://github.com/frankier/finntk/blob/2f0ba49cd86002528431903c090d28852356eff7/finntk/vendor/elmo.py

frankier avatar Oct 10 '18 12:10 frankier

For people looking to use the AllenNLP framework:

A few people asked me about this, so I thought it would be better to post it here too, in case others reach this thread. I've merged code into the AllenNLP repo that supports cross-lingual ELMo (with alignment to a shared space, as described in our paper on cross-lingual alignment of contextual embeddings).

However, that code still requires ELMo models trained with AllenNLP. I trained them for a few languages (unfortunately, not as many as in this great repo), and you can find more details here: https://github.com/TalSchuster/CrossLingualELMo

TalSchuster avatar Jun 19 '19 15:06 TalSchuster