
Has Biomegatron been removed from huggingface.co/models?

Open jiwonjoung opened this issue 2 years ago • 3 comments

I can't seem to get NeMo to recognize BioMegatron as a pretrained model in the config file. For example, if I put `config.model.language_model.pretrained_model_name = 'biomegatron345m_biovocab_30k_cased'` in my config file, as per https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Token_Classification-BioMegatron.ipynb, I get this error:

OSError: biomegatron345m_biovocab_30k_cased is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.

And upon scrolling through the models on huggingface.co/models, I do not see a BioMegatron model listed.
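
For what it's worth, the error text looks like the one `transformers` raises when a name is neither a local folder nor a Hub model id, so the name seems to be going to HuggingFace instead of being resolved by NeMo. A minimal way to see the same message outside NeMo (just for illustration):

```python
from transformers import AutoModel

# Raises the same OSError for any name that is neither a local folder
# nor a model id hosted on huggingface.co/models.
AutoModel.from_pretrained("biomegatron345m_biovocab_30k_cased")
```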

jiwonjoung avatar Jun 30 '22 16:06 jiwonjoung

I'm facing the same issue. I am unable to import it, and I'm also unable to run it using BioBERT.

FatimaArshad-DS avatar Jul 01 '22 09:07 FatimaArshad-DS

I'm having trouble reproducing this. I can run the notebook on my workstation. Also, this model is not currently on HuggingFace. It's hosted on NGC:

PretrainedModelInfo(
        pretrained_model_name=biomegatron345m_biovocab_30k_cased,
        description=Megatron 345m parameters model with biomedical vocabulary ({vocab_size} size) {vocab}, pre-trained on PubMed biomedical text corpus.,
        location=https://api.ngc.nvidia.com/v2/models/nvidia/nemo/biomegatron345m_biovocab_30k_cased/versions/1/files/BioMegatron345m-biovocab-30k-cased.nemo
)
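
If the name isn't resolving on your end, it may help to print what your installed NeMo version actually registers; a rough check (the exact list depends on the NeMo version):

```python
import nemo.collections.nlp as nemo_nlp

# Names accepted for model.language_model.pretrained_model_name; the NGC-hosted
# BioMegatron checkpoints should appear here if your NeMo version registers them.
print(nemo_nlp.modules.get_pretrained_lm_models_list())
```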

What is the full output you get from this cell?

model_ner = nemo_nlp.models.TokenClassificationModel(cfg=config.model, trainer=trainer)

ericharper avatar Jul 01 '22 23:07 ericharper

> I'm having trouble reproducing this. I can run the notebook on my workstation. Also, this model is not currently on HuggingFace. It's hosted on NGC:
>
> PretrainedModelInfo(
>         pretrained_model_name=biomegatron345m_biovocab_30k_cased,
>         description=Megatron 345m parameters model with biomedical vocabulary ({vocab_size} size) {vocab}, pre-trained on PubMed biomedical text corpus.,
>         location=https://api.ngc.nvidia.com/v2/models/nvidia/nemo/biomegatron345m_biovocab_30k_cased/versions/1/files/BioMegatron345m-biovocab-30k-cased.nemo
> )
>
> What is the full output you get from this cell?
>
> model_ner = nemo_nlp.models.TokenClassificationModel(cfg=config.model, trainer=trainer)

So actually, I'm trying to implement medical entity linking with the BioMegatron model. I can run the token classification with BioMegatron just fine. However, when I run the entity linking tutorial after adding `config.model.language_model.pretrained_model_name = 'biomegatron345m_biovocab_30k_cased'`, I get the error. Perhaps I need to make more configuration adjustments to use BioMegatron with the entity linking tutorial? If so, what am I missing?
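
As a follow-up check on my side, I can also print which pretrained names the entity linking model class itself registers (a rough sketch; I'm assuming the tutorial builds `nemo_nlp.models.EntityLinkingModel`, which may not be the exact class):

```python
import nemo.collections.nlp as nemo_nlp

# Which pretrained checkpoints (if any) the entity linking model class registers;
# EntityLinkingModel is assumed here to be the class the tutorial builds.
print(nemo_nlp.models.EntityLinkingModel.list_available_models())
```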

jiwonjoung avatar Jul 11 '22 19:07 jiwonjoung

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] avatar Sep 29 '22 02:09 github-actions[bot]