Fine-tuning a pretrained Hugging Face transformer in DJL
Description
Hello,
I am currently looking into the possibility of fine-tuning Hugging Face transformers in DJL. I understand that there is now a new Hugging Face tokenizer library, and I tested the experimental tool you provided to convert existing fine-tuned models to TorchScript and deploy them with DJL and DJL Serving. I went through the nice example https://github.com/deepjavalibrary/djl/blob/master/jupyter/rank_classification_using_BERT_on_Amazon_Review.ipynb . Do I understand correctly that, in order to fine-tune any Hugging Face transformer model, I currently need to create an NLP.WORD_EMBEDDING TorchScript model to use as a pretrained block, and then fine-tune the classifier head in DJL? Or is there another option I missed?
Will this change the current api? How?
Probably not.
Who will benefit from this enhancement?
Everyone.
Yes, this strategy of "create an NLP.WORD_EMBEDDING TorchScript model to use as a pretrained block, then fine-tune the classifier head in DJL" is aligned with our design, just like in https://github.com/deepjavalibrary/djl/blob/master/jupyter/rank_classification_using_BERT_on_Amazon_Review.ipynb. This follows the idea of transfer learning.
Basically, the TorchScript model can be loaded as a PtSymbolBlock, whose parameters can be retrieved and fine-tuned.
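To make the flow concrete, here is a minimal sketch of that approach: load the traced TorchScript embedding model (which DJL wraps as a PtSymbolBlock), then stack a trainable classification head on top, as in the Amazon Review notebook. The model path and the number of output classes are placeholders, not values from the original discussion.

```java
import java.nio.file.Paths;

import ai.djl.Model;
import ai.djl.ndarray.NDList;
import ai.djl.nn.Block;
import ai.djl.nn.SequentialBlock;
import ai.djl.nn.core.Linear;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class FineTuneSketch {

    public static void main(String[] args) throws Exception {
        // Load the traced TorchScript embedding model; the path is a placeholder.
        Criteria<NDList, NDList> criteria = Criteria.builder()
                .setTypes(NDList.class, NDList.class)
                .optModelPath(Paths.get("build/bert-embedding"))
                .optEngine("PyTorch")
                .build();

        try (ZooModel<NDList, NDList> embedding = criteria.loadModel()) {
            // The loaded block is a PtSymbolBlock under the hood.
            Block pretrained = embedding.getBlock();

            // Optionally freeze the pretrained parameters for pure transfer learning;
            // leave them unfrozen to fine-tune the whole network.
            pretrained.freezeParameters(true);

            // Attach a trainable classification head on top of the embedding.
            SequentialBlock network = new SequentialBlock()
                    .add(pretrained)
                    .add(Linear.builder().setUnits(5).build()); // e.g. 5 rating classes

            try (Model model = Model.newInstance("bert-classifier")) {
                model.setBlock(network);
                // ... then configure a Trainer, loss, and dataset, and train as usual.
            }
        }
    }
}
```

The head (here a single Linear layer) is the part that is trained from scratch, while the PtSymbolBlock supplies the pretrained representations.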