Can't use ELMo in deep learning training
- What is your OS and architecture? Ubuntu 22.04
- What is your Java version (java --version)? openjdk version "1.8.0_342"
Hi @kermitt2, I successfully trained a citation model using DeLFT, but when I try to use ELMo I get this error message...
```
Loading data...
Error: either provide a path to a directory with the ELMo model individual options and weight file or to the model in a ZIP archive.
10000 total sequences
9000 train sequences
1000 validation sequences
ELMo weights used: /media/lopez/T51/embeddings/elmo_2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5
```
Where can I find this file, and how can I change the path?
Dear @rodyoukai,
You have to configure the ELMo embeddings in delft/resources-registry.json. See the file: https://github.com/kermitt2/delft/blob/master/delft/resources-registry.json
Change the paths in this section:
"embeddings-contextualized": [
{
"name": "elmo-en",
"path-config": "/media/lopez/T51/embeddings/elmo_2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_options.json",
"path_weights": "/media/lopez/T51/embeddings/elmo_2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5",
"path-vocab": "data/models/ELMo/en/vocab.txt",
"path-cache": "data/models/ELMo/en/",
"cache-training": true,
"lang": "en",
"url_config": "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_options.json",
"url_weights": "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5"
},
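For reference, this is roughly how that registry entry gets picked up at training time. A minimal sketch, assuming DeLFT's Sequence wrapper; the "glove-840B" embeddings name is just an example entry from the same registry:

```python
# Minimal sketch: with use_ELMo=True, DeLFT's sequence labelling wrapper
# resolves the "elmo-en" entry from resources-registry.json and loads the
# options/weights files from the paths configured there.
from delft.sequenceLabelling import Sequence

# "glove-840B" is an example static embeddings name from the same registry;
# replace it with whatever embeddings you normally train with.
model = Sequence("citation", embeddings_name="glove-840B", use_ELMo=True)
```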
Are you using Docker?
Hi @lfoppiano, thanks for your answer. I changed the path and now I have this other issue:
```
embeddings loaded for 2196007 words and 300 dimensions
ELMo weights used: /home/rcuellar/T51/embeddings/elmo_2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5
Error: either provide a path to a directory with the ELMo model individual options and weight file or to the model in a ZIP archive.
Model for citation created in 1112798 ms
```
Maybe I need to download the hdf5 and json files manually from AWS?
Hi @rodyoukai
Yes, you need to manually download the config and weights files for ELMo. Contrary to transformers and static embeddings, the automatic download is not implemented yet! (my fault, I forgot to add it :)
The two URLs are still working, so you just need to download these files and move them to the path you indicated.
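A minimal download sketch, using the two URLs from the registry excerpt above; the target directory is the one from the example config, so adjust it to whatever paths you set in resources-registry.json:

```python
# Download the ELMo options and weights files to the directory configured
# in resources-registry.json (target_dir below is just the example path).
import os
import urllib.request

target_dir = "/media/lopez/T51/embeddings/elmo_2x4096_512_2048cnn_2xhighway_5.5B"
base_url = ("https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/"
            "2x4096_512_2048cnn_2xhighway_5.5B")

os.makedirs(target_dir, exist_ok=True)
for name in (
    "elmo_2x4096_512_2048cnn_2xhighway_5.5B_options.json",
    "elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5",
):
    urllib.request.urlretrieve(f"{base_url}/{name}",
                               os.path.join(target_dir, name))
```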
ELMo works very well for sequence labeling! (and it is almost as fast as transformers in the latest version of DeLFT)
Thanks @kermitt2, I will follow your instructions.
Hi again @kermitt2 and @lfoppiano, I got the weights and config from Amazon S3 and set up all the config correctly, I guess, but now I get this error when I try to train the citation model:
```
Loading data...
17117 total sequences
15405 train sequences
1712 validation sequences
ELMo weights used: /home/rcuellar/embeddings/T51/elmo_2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5
Traceback (most recent call last):
  File "/home/rcuellar/delft/env/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1380, in _do_call
    return fn(*args)
  File "/home/rcuellar/delft/env/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1363, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/home/rcuellar/delft/env/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1456, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor of shape [2048] and type float
	 [[{{node bilm/CNN_high_1/b_carry/Initializer/initial_value}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "delft/applications/grobidTagger.py", line 351, in <module>

Errors may have originated from an input operation.
Operation defined at: (most recent call last)
  File "delft/applications/grobidTagger.py", line 351, in <module>
    train(model,
  File "delft/applications/grobidTagger.py", line 133, in train
    model = Sequence(model_name,
  File "/home/rcuellar/delft/delft/sequenceLabelling/wrapper.py", line 112, in __init__
    self.embeddings = Embeddings(self.embeddings_name, resource_registry=self.registry, use_ELMo=use_ELMo)
  File "/home/rcuellar/delft/delft/utilities/Embeddings.py", line 82, in __init__
    self.make_ELMo()
  File "/home/rcuellar/delft/delft/utilities/Embeddings.py", line 320, in make_ELMo
    self.elmo_model.load(vocab_file=vocab_file,
  File "/home/rcuellar/delft/delft/utilities/simple_elmo/elmo_helpers.py", line 203, in load
    self.sentence_embeddings_op = bilm(self.sentence_character_ids)
  File "/home/rcuellar/delft/delft/utilities/simple_elmo/model.py", line 96, in __call__
    lm_graph = BidirectionalLanguageModelGraph(
  File "/home/rcuellar/delft/delft/utilities/simple_elmo/model.py", line 284, in __init__
    self._build()
  File "/home/rcuellar/delft/delft/utilities/simple_elmo/model.py", line 288, in _build
    self._build_word_char_embeddings()
  File "/home/rcuellar/delft/delft/utilities/simple_elmo/model.py", line 442, in _build_word_char_embeddings
    b_carry = tf.compat.v1.get_variable(
  File "/home/rcuellar/delft/delft/utilities/simple_elmo/model.py", line 273, in custom_getter
    return getter(name, *args, **kwargs)

Original stack trace for 'bilm/CNN_high_1/b_carry/Initializer/initial_value':
  File "delft/applications/grobidTagger.py", line 351, in <module>

Model for citation created in 50914 ms
```
Can you help me or give me some advice?
Hi @rodyoukai !
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor of shape [2048] and type float
It seems that you're running out of memory with ELMo. It's a good sign that it was loaded, though :)
What's your GPU memory capacity? I think I got ELMo running well with 4GB of GPU memory, but for training you rather need something like 8GB.
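If you're not sure of your capacity, a quick way to check it (assuming an NVIDIA GPU with nvidia-smi on the PATH; this is not part of DeLFT):

```python
# Print the total memory of the installed GPU(s) via nvidia-smi.
import subprocess

result = subprocess.run(
    ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print("Total GPU memory:", result.stdout.strip())  # e.g. "8192 MiB"
```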
Hi @kermitt2, thanks for the quick answer. I have 3GB of GPU memory; is there any kind of workaround for this?
Unfortunately, training with ELMo embeddings is not manageable with this amount of GPU memory (I don't think you can run it below 4GB either, but I'm not sure).
Thanks for the answer @kermitt2. I have another question: do you know if ELMo embeddings exist for Spanish, and where I can download them?
Hi again @lfoppiano or @kermitt2, I need some advice: how can I create an ELMo embeddings model in Spanish? I suppose you created the English and French hdf5 files; can you share the process, so maybe I can generate my own embeddings in Spanish...
@rodyoukai we used the ELMo embeddings already made available by the authors of ELMo. I personally don't know exactly how to create a new ELMo embedding from scratch. You might check this issue: https://github.com/kermitt2/delft/issues/155 but Patrice's tests did not show an improvement, so that implementation was not kept.
Hi @rodyoukai
Let me give you more details:
- English ELMo is from AI2, the authors of ELMo; at the time, training a model (1B tokens) took 3 GPUs for 2 weeks, if I remember well
- French ELMo was created by @pjox (who built OSCAR, which includes training resources for Spanish); he's likely the right person to contact if you want to push this forward
About elmoformanylangs: it includes a pretrained ELMo model for Spanish. However, after integrating and benchmarking these models in DeLFT (and I think I did it correctly), you can see that they were only marginally better than static embeddings, and not at all at the level of the "standard" ELMo models:
https://github.com/kermitt2/delft/issues/155#issuecomment-1400735270
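If you want to experiment with it outside DeLFT, a minimal sketch of querying the Spanish ELMoForManyLangs model directly; it assumes the Spanish model archive has been downloaded and unpacked into "./es.model/" (that directory name is hypothetical):

```python
# Embed tokenized Spanish sentences with a pretrained ELMoForManyLangs model.
from elmoformanylangs import Embedder

embedder = Embedder("./es.model/")  # path to the unpacked Spanish model
sentences = [["Hola", "mundo"], ["Esto", "es", "una", "prueba"]]
# sents2elmo returns, per sentence, a numpy array of shape (num_tokens, 1024)
vectors = embedder.sents2elmo(sentences)
print(vectors[0].shape)  # (2, 1024)
```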
Thanks for the answers @kermitt2 & @lfoppiano