easy-bert
Additional models for Java API
Are there any plans to add additional models, e.g. BERT-Large, for the Java API?
Not the maintainer, but I think you can just use the TFHub models available off the shelf, e.g.:
from easybert import Bert
bert = Bert("https://tfhub.dev/google/bert_cased_L-24_H-1024_A-16/1") # large cased model
bert.save("/path/to/your/model/")
In the Java API, models are loaded through additional dependencies. That's what I was referring to. I will update the issue description.
Hey, the Large models currently aren't provided because they exceed the artifact size maximum for Sonatype's Maven Repository (1GB), so I can't distribute them through there. If you can find an alternative public Maven repository that will allow larger files I can look into posting them there.
Ah, true. AFAIK, Bintray also allows only up to 250 MB per jar file for free for OSS projects.
Just an idea: perhaps it's possible to package a model into multiple jar files of at most 1 GB or 250 MB each (depending on the hosting)? Then one would have to add multiple dependencies to use a model, which is OK I suppose, e.g.:
dependencies {
    compile "com.robrua.nlp.models:easy-bert-cased-L-12-H-768-A-16_part_1_of_3:1.0.0"
    compile "com.robrua.nlp.models:easy-bert-cased-L-12-H-768-A-16_part_2_of_3:1.0.0"
    compile "com.robrua.nlp.models:easy-bert-cased-L-12-H-768-A-16_part_3_of_3:1.0.0"
}
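To make the splitting idea a bit more concrete, here is a rough sketch of cutting a large file into size-capped parts and stitching them back together. It is only an illustration, not how easy-bert packages models: the function names (`split_file`, `join_parts`, `sha256`) and the `.partNNN` naming are made up for this example. Wrapping each part in a jar and reassembling on the Java side are not shown.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List


def split_file(source: Path, out_dir: Path, max_part_bytes: int) -> List[Path]:
    """Split `source` into numbered parts of at most `max_part_bytes` bytes.

    Hypothetical helper: each part is read into memory in one go, which is
    acceptable for a sketch but may need buffering for very large caps.
    """
    if max_part_bytes <= 0:
        raise ValueError("max_part_bytes must be positive")
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with source.open("rb") as src:
        index = 1
        while True:
            chunk = src.read(max_part_bytes)
            if not chunk:
                break
            # Zero-padded index so that lexicographic order equals part order.
            part = out_dir / f"{source.name}.part{index:03d}"
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_parts(parts: List[Path], target: Path) -> None:
    """Concatenate the parts, in name order, back into a single file."""
    with target.open("wb") as dst:
        for part in sorted(parts):
            dst.write(part.read_bytes())


def sha256(path: Path) -> str:
    """Hex SHA-256 of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


if __name__ == "__main__":
    # Tiny stand-in for a model file; a real cap would be e.g. 250 * 1024**2.
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        model = tmp_dir / "model.bin"
        model.write_bytes(b"\x00\x01\x02\x03" * 1000)
        parts = split_file(model, tmp_dir / "parts", max_part_bytes=1500)
        restored = tmp_dir / "restored.bin"
        join_parts(parts, restored)
        print(len(parts), sha256(model) == sha256(restored))
```

Comparing checksums after reassembly is a cheap way to catch a missing or misordered part. The catch with this approach is that the consumer has to rebuild the model on disk before it can be loaded, so the Java loader would need an extra step like `join_parts`.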
Is there code to convert a BERT model into the required file format? It seems that the current code only supports adding models from TFHub. Such a converter might allow releasing models via any file-sharing service, with each user doing the conversion on their own machine.