
Additional models for Java API

Open tovbinm opened this issue 5 years ago • 5 comments

Are there any plans to add additional models, e.g. BERT-Large, for the Java API?

tovbinm avatar Jul 09 '19 19:07 tovbinm

Not the maintainer, but I think you can just use the TFHub models available off the shelf, e.g.:

from easybert import Bert
bert = Bert("https://tfhub.dev/google/bert_cased_L-24_H-1024_A-16/1") # large cased model
bert.save("/path/to/your/model/")

somerandomguyontheweb avatar Jul 10 '19 07:07 somerandomguyontheweb

With the Java API, models are loaded via additional Maven dependencies. That's what I was referring to. I will update the issue description.

tovbinm avatar Jul 10 '19 14:07 tovbinm

Hey, the Large models currently aren't provided because they exceed the artifact size maximum for Sonatype's Maven Repository (1GB), so I can't distribute them through there. If you can find an alternative public Maven repository that will allow larger files I can look into posting them there.

robrua avatar Jul 11 '19 03:07 robrua

Ah, true. AFAIK, Bintray also allows only up to 250 MB per jar file for free OSS projects.

Just an idea: perhaps it's possible to package a model into multiple jar files (1 GB or 250 MB each, depending on the hosting)? Then one would have to add multiple dependencies to use a model, which seems acceptable, e.g.:

dependencies {
    compile "com.robrua.nlp.models:easy-bert-cased-L-12-H-768-A-16_part_1_of_3:1.0.0"
    compile "com.robrua.nlp.models:easy-bert-cased-L-12-H-768-A-16_part_2_of_3:1.0.0"
    compile "com.robrua.nlp.models:easy-bert-cased-L-12-H-768-A-16_part_3_of_3:1.0.0"
}
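To make the idea concrete, here is a minimal sketch (not part of easy-bert; all names are hypothetical) of how a serialized model file could be split into fixed-size parts for separate artifacts and reassembled at load time:

```python
import os

def split_model(path, out_dir, chunk_size=200 * 1024 * 1024):
    """Split a serialized model file into numbered fixed-size parts.

    chunk_size defaults to 200 MB to stay under a hypothetical 250 MB
    per-artifact cap; each part could be shipped in its own jar.
    """
    os.makedirs(out_dir, exist_ok=True)
    parts = []
    with open(path, "rb") as src:
        index = 1
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part_path = os.path.join(out_dir, f"model.part{index}")
            with open(part_path, "wb") as out:
                out.write(chunk)
            parts.append(part_path)
            index += 1
    return parts

def join_model(parts, out_path):
    """Concatenate the parts (in order) back into the original model file."""
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                out.write(src.read())
    return out_path
```

The Java side would then need a small loader that finds all `model.partN` resources on the classpath and concatenates them before handing the bytes to TensorFlow.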

tovbinm avatar Jul 11 '19 03:07 tovbinm

Is there code to convert a BERT model into this file format? It seems that the current code only supports adding models from TFHub. Such a converter would allow releasing models via any file-sharing service, with each user doing the conversion on their own machine.
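A rough sketch of that distribution workflow, assuming `bert.save()` writes the same on-disk layout the Java API loads from a local path (the function names here are hypothetical, not part of easy-bert):

```python
import shutil

def pack_model(model_dir, archive_base):
    """Zip a saved easy-bert model directory for sharing via any file host.

    The directory would typically come from something like:
        Bert("https://tfhub.dev/google/bert_cased_L-24_H-1024_A-16/1").save(model_dir)
    """
    return shutil.make_archive(archive_base, "zip", model_dir)

def unpack_model(archive_path, target_dir):
    """Unpack a downloaded model archive to a local directory.

    The Java side could then load it from that path instead of Maven.
    """
    shutil.unpack_archive(archive_path, target_dir, "zip")
    return target_dir
```

Under that assumption, each user only downloads the archive once and never touches Maven for the large models.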

tkorach avatar Jan 25 '20 18:01 tkorach