next_word_prediction
The model takes a long time to load; I want to reduce the load time for deployment.
You can just use bert-base-uncased and comment out or remove the other models. In my tests, BERT (500M?) is good enough.