Transformers.jl
scibert models missing loading_method?
I can load all the bert models but none of the scibert ones:
julia> bert_model, wordpiece, tokenizer = pretrain"bert-uncased_L-12_H-768_A-12"
[ Info: loading pretrain bert model: uncased_L-12_H-768_A-12.tfbson
...
julia> bert_model, wordpiece, tokenizer = pretrain"scibert-scibert_scivocab_uncased"
ERROR: unknown pretrain type
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] loading_method(x::Val{:scibert})
@ Transformers.Pretrain ~/.julia/packages/Transformers/jtjKq/src/pretrain/Pretrain.jl:46
[3] load_pretrain(str::String; kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Transformers.Pretrain ~/.julia/packages/Transformers/jtjKq/src/pretrain/Pretrain.jl:58
[4] load_pretrain(str::String)
@ Transformers.Pretrain ~/.julia/packages/Transformers/jtjKq/src/pretrain/Pretrain.jl:57
[5] top-level scope
@ REPL[12]:1
Seems there is no loading_method for :scibert.
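From the stacktrace my guess (just a sketch of the pattern, not the actual Pretrain.jl source) is that the prefix before the first - in the pretrain string is dispatched on via Val, and :scibert simply has no registered method, so it falls through to the error, roughly like:

# hypothetical stand-ins, only to illustrate the dispatch pattern
loading_method(::Val{:bert}) = :load_bert   # registered, so pretrain"bert-..." works
loading_method(::Val{:gpt})  = :load_gpt    # registered
loading_method(x) = error("unknown pretrain type")  # anything else, e.g. Val(:scibert)

loading_method(Val(:bert))     # => :load_bert
loading_method(Val(:scibert))  # => ERROR: unknown pretrain type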
OK, I just realized the scibert models are registered as bert models and can thus be loaded with the bert loading_method, so this works:
bert_model, wordpiece, tokenizer = pretrain"bert-scibert_scivocab_uncased"
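For what it's worth, the loaded scibert pieces then seem usable exactly like the regular bert ones (this just follows the bert example from the docs, with a placeholder sentence, so I may be missing a subtlety):

julia> using Transformers.Basic

julia> vocab = Vocabulary(wordpiece);

julia> tokens = "The protein binds to the receptor" |> tokenizer |> wordpiece;

julia> indices = vocab(["[CLS]"; tokens; "[SEP]"]);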
What led me astray was the output of the pretrains() method, which made me think one should concatenate the 2nd and 3rd columns to get the model name to pass to pretrain:
julia> pretrains()
Type model model name support items
–––– ––––––– –––––––––––––––––––––––––––– ––––––––––––––––––––––––––––––––
Gpt gpt OpenAIftlm gpt_model, bpe, vocab, tokenizer
Bert scibert scibert_scivocab_uncased bert_model, wordpiece, tokenizer
Bert scibert scibert_basevocab_cased bert_model, wordpiece, tokenizer
Bert scibert scibert_basevocab_uncased bert_model, wordpiece, tokenizer
Bert scibert scibert_scivocab_cased bert_model, wordpiece, tokenizer
Bert bert cased_L-12_H-768_A-12 bert_model, wordpiece, tokenizer
Bert bert wwm_cased_L-24_H-1024_A-16 bert_model, wordpiece, tokenizer
Bert bert uncased_L-12_H-768_A-12 bert_model, wordpiece, tokenizer
Bert bert multi_cased_L-12_H-768_A-12 bert_model, wordpiece, tokenizer
Bert bert wwm_uncased_L-24_H-1024_A-16 bert_model, wordpiece, tokenizer
Bert bert multilingual_L-12_H-768_A-12 bert_model, wordpiece, tokenizer
Bert bert chinese_L-12_H-768_A-12 bert_model, wordpiece, tokenizer
Bert bert cased_L-24_H-1024_A-16 bert_model, wordpiece, tokenizer
Bert bert uncased_L-24_H-1024_A-16 bert_model, wordpiece, tokenizer
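In other words, the string for pretrain"..." seems to be the Type column (lowercased) plus the model name, with the model column not appearing in it at all, e.g.:

bert_model, wordpiece, tokenizer = pretrain"bert-scibert_scivocab_cased"  # Type "Bert" + model name
gpt_model, bpe, vocab, tokenizer = pretrain"gpt-OpenAIftlm"               # Type "Gpt" + model name (extrapolating from the table; I have only tested the bert/scibert ones)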
Possibly this could be clarified in the output and/or documentation. Sorry if I just missed it.
It is written in the docstring for @pretrain_str, but I agree that might be a little misleading.