
Fine-tuning a pretrained Hugging Face transformer in DJL

Open oskoumal opened this issue 2 years ago • 1 comment

Description

Hello,

I am currently looking into the possibility of fine-tuning Hugging Face transformers in DJL. I understand that there is now a new Hugging Face tokenizer library, and I have tested the experimental tool you provide for converting existing fine-tuned models to TorchScript and deploying them with DJL and DJL Serving. I also studied the example https://github.com/deepjavalibrary/djl/blob/master/jupyter/rank_classification_using_BERT_on_Amazon_Review.ipynb. Do I understand correctly that, to fine-tune any Hugging Face transformer model, I currently need to create an NLP.WORD_EMBEDDING TorchScript model to use as a pretrained block, and can then fine-tune the classifier head in DJL? Or is there another option I missed?

Will this change the current api? How?

Probably not.

Who will benefit from this enhancement?

Everyone.

oskoumal avatar Jun 27 '23 06:06 oskoumal

Yes, the strategy of "train an NLP.WORD_EMBEDDING TorchScript model to use as a pretrained block, then fine-tune a classifier head in DJL" is aligned with our design, as demonstrated in https://github.com/deepjavalibrary/djl/blob/master/jupyter/rank_classification_using_BERT_on_Amazon_Review.ipynb. This follows the idea of transfer learning.

Basically, the TorchScript model can be loaded as a PtSymbolBlock, whose parameters can be obtained and fine-tuned.
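For illustration, here is a minimal sketch of this pattern in DJL. It loads a local TorchScript embedding model (DJL wraps it in a PtSymbolBlock), optionally freezes its parameters, and stacks a fresh linear classification head on top. The model path `build/pytorch/bert`, the model name, and the 5-class head are hypothetical placeholders, not values from the notebook:

```java
import java.nio.file.Paths;

import ai.djl.Model;
import ai.djl.ndarray.NDList;
import ai.djl.nn.Block;
import ai.djl.nn.SequentialBlock;
import ai.djl.nn.core.Linear;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import ai.djl.training.DefaultTrainingConfig;
import ai.djl.training.loss.Loss;

public class FineTuneSketch {
    public static void main(String[] args) throws Exception {
        // Load the traced TorchScript model with the PyTorch engine.
        Criteria<NDList, NDList> criteria = Criteria.builder()
                .setTypes(NDList.class, NDList.class)
                .optModelPath(Paths.get("build/pytorch/bert")) // hypothetical path
                .optEngine("PyTorch")
                .build();

        try (ZooModel<NDList, NDList> embedding = criteria.loadModel()) {
            Block pretrained = embedding.getBlock();
            // Optionally freeze the pretrained parameters so only the head trains.
            pretrained.freezeParameters(true);

            // Attach a new classification head on top of the pretrained block.
            SequentialBlock net = new SequentialBlock()
                    .add(pretrained)
                    .add(Linear.builder().setUnits(5).build()); // 5 classes, hypothetical

            try (Model model = Model.newInstance("review-classifier")) {
                model.setBlock(net);
                DefaultTrainingConfig config =
                        new DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss());
                // model.newTrainer(config) ... then run the usual DJL training loop,
                // as in the rank-classification notebook linked above.
            }
        }
    }
}
```

Only the linear head's parameters are updated during training when the pretrained block is frozen; unfreezing it later allows full fine-tuning at a lower learning rate.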

KexinFeng avatar Jun 27 '23 17:06 KexinFeng