LiLT Fine-tuning on custom data


Open siamakzd opened this issue 2 years ago • 3 comments

Thank you for sharing your great work!

If I want to fine-tune on a custom dataset, what steps should I follow? Specifically:

- What is the input data format for training, testing, and inference?
- Which scripts do we need to modify?

Thanks in advance!

Mar 30 '22 03:03 siamakzd

Hi, I think the main steps should be:

Organize your dataset into the FUNSD or XFUND format, depending on whether your dataset is monolingual or multilingual.
Put YourDataset.py under LiLTfinetune/data/datasets/. You can use funsd.py or xfun.py as a reference.
Put run_YourDataset_YourTask.py under examples/. You can use run_funsd.py, run_xfun_re.py, or run_xfun_ser.py as a reference.
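
For the first step, here is a minimal sketch of what a FUNSD-style annotation file can look like when built from custom OCR output. The field names (form, id, text, box, label, words, linking) follow the public FUNSD annotation files. The helper names make_entity, union_box, and normalize_box, as well as the sample words and coordinates, are hypothetical and only for illustration. The 0-1000 box scaling is the usual LayoutLM-style convention; check funsd.py to confirm how your dataset script should apply it.

```python
import json


def union_box(boxes):
    """Smallest [x0, y0, x1, y1] box that encloses all the given boxes."""
    return [
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    ]


def make_entity(entity_id, words, label, linking=()):
    """Build one FUNSD-style entity from a list of (text, pixel_box) pairs."""
    return {
        "id": entity_id,
        "text": " ".join(text for text, _ in words),
        "box": union_box([box for _, box in words]),
        "label": label,  # FUNSD labels: "question", "answer", "header", "other"
        "words": [{"text": text, "box": list(box)} for text, box in words],
        "linking": [list(pair) for pair in linking],  # [question_id, answer_id]
    }


def normalize_box(box, width, height):
    """Scale a pixel box to the 0-1000 range (assumed LayoutLM-style convention)."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


# Hypothetical OCR output for one page: a key and its value.
question = make_entity(
    0,
    [("Invoice", [50, 100, 120, 130]), ("No.", [125, 100, 150, 130])],
    "question",
    linking=[(0, 1)],
)
answer = make_entity(1, [("12345", [160, 100, 220, 130])], "answer", linking=[(0, 1)])

# One JSON document per page image, with all entities under "form".
document = {"form": [question, answer]}
funsd_json = json.dumps(document, indent=2)
```

Writing funsd_json to one file per image, next to the image itself, mirrors the layout of the FUNSD release. The boxes stay in pixel coordinates in the file, and normalize_box would be called inside YourDataset.py once the image size is known.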

If you want to do something beyond training and evaluation, you can add your code after the lines where the model makes predictions, such as https://github.com/jpWang/LiLT/blob/main/examples/run_funsd.py#L345 in run_funsd.py.
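
As a rough illustration of the kind of code that could go after the prediction step, the sketch below turns per-token scores into label strings. It assumes the common Hugging Face token-classification convention in which positions to skip (special tokens, non-first subwords) carry the label id -100; whether run_funsd.py uses exactly this convention should be checked in the script. The function name decode_predictions and the sample values are hypothetical.

```python
def decode_predictions(logits, label_ids, label_list, ignore_index=-100):
    """Map per-token scores to label strings, skipping ignored positions.

    logits:     list of sequences, each a list of per-token score lists
    label_ids:  gold label ids with the same shape, ignore_index where skipped
    label_list: label names indexed by class id
    """
    decoded = []
    for seq_logits, seq_labels in zip(logits, label_ids):
        tags = []
        for token_logits, gold in zip(seq_logits, seq_labels):
            if gold == ignore_index:
                continue  # special token or subword continuation (assumed)
            best = max(range(len(token_logits)), key=token_logits.__getitem__)
            tags.append(label_list[best])
        decoded.append(tags)
    return decoded


# Hypothetical values: one sequence of three tokens, the middle one ignored.
label_list = ["O", "B-QUESTION", "B-ANSWER"]
logits = [[[0.1, 2.0, 0.3], [5.0, 0.1, 0.2], [0.0, 0.1, 3.0]]]
label_ids = [[1, -100, 2]]
predicted_tags = decode_predictions(logits, label_ids, label_list)
```

From here the tags can be written to a file, merged back with the word boxes, or passed to any custom post-processing.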

Mar 30 '22 07:03 jpWang

Hi,

See also my demo notebooks here: https://github.com/NielsRogge/Transformers-Tutorials/tree/master/LiLT

Nov 21 '22 18:11 NielsRogge

Hello,

Could you explain how to organize a custom dataset into the FUNSD/XFUND format?

Also, do you recommend any tutorial for this step?

Thank you in advance.

May 14 '23 22:05 hamzabchiri
