LiLT
Fine-tuning on custom data
Thank you for sharing your great work!
If I want to fine-tune on a custom dataset, what steps should I follow? Specifically:
- What is the input data format for training, testing, and inference?
- Which scripts do we need to modify?
Thanks in advance!
Hi, I think the main steps should be:
- Organize your dataset into the FUNSD/XFUND format, depending on whether your dataset is monolingual or multilingual.
- Put YourDataset.py under LiLTfinetune/data/datasets/. You can refer to funsd.py/xfun.py.
- Put run_YourDataset_YourTask.py under examples/. You can refer to run_funsd.py/run_xfun_re.py/run_xfun_ser.py.
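For the first step, here is a minimal sketch of what a FUNSD-style annotation file looks like when built from your own OCR output. The keys ("form", "id", "text", "box", "label", "words", "linking") follow the public FUNSD annotations; the helper names `to_funsd_entity` and `normalize_box` and the sample words are hypothetical, not part of the LiLT repo. Boxes in the JSON are pixel coordinates; LayoutLM-style models such as LiLT expect boxes scaled to 0-1000, and the stock funsd.py appears to do that scaling at load time, so check your copy before normalizing twice.

```python
import json

def normalize_box(box, width, height):
    """Scale a pixel box [x0, y0, x1, y1] to the 0-1000 range (hypothetical helper)."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]

def to_funsd_entity(entity_id, label, words, links=()):
    """Build one FUNSD-style entity from (text, box) word tuples (hypothetical helper)."""
    xs0, ys0, xs1, ys1 = zip(*(box for _, box in words))
    return {
        "id": entity_id,
        "text": " ".join(text for text, _ in words),
        # The entity box is the union of its word boxes, in pixels.
        "box": [min(xs0), min(ys0), max(xs1), max(ys1)],
        "label": label,  # FUNSD uses question / answer / header / other
        "words": [{"text": t, "box": list(b)} for t, b in words],
        # Pairs of entity ids, e.g. a question linked to its answer.
        "linking": [list(pair) for pair in links],
    }

# Made-up sample data standing in for your OCR output.
doc = {
    "form": [
        to_funsd_entity(
            0, "question",
            [("Invoice", (10, 10, 80, 30)), ("No:", (85, 10, 120, 30))],
            links=[(0, 1)],
        ),
        to_funsd_entity(1, "answer", [("12345", (130, 10, 190, 30))], links=[(0, 1)]),
    ]
}

print(json.dumps(doc, indent=2))
```

Writing one such JSON file per page image, next to the image itself, should give a layout close enough to FUNSD that funsd.py can be adapted with only path and label changes.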
If you want to do something beyond training/evaluating, you can add your code right after the point where the model makes its predictions, such as https://github.com/jpWang/LiLT/blob/main/examples/run_funsd.py#L345 in run_funsd.py.
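As a rough illustration of the kind of code that could go there, the sketch below turns per-token logits into label strings and skips positions whose gold label is -100 (the usual ignore index for padding and sub-word pieces). This is an assumption about what the prediction output looks like, not a copy of run_funsd.py; `decode_predictions`, the label list, and the logits are all made up for the example.

```python
IGNORE_INDEX = -100  # positions with this gold label are skipped

def decode_predictions(logits, label_ids, label_list):
    """Map per-token logits to label strings, dropping ignored positions (hypothetical helper)."""
    decoded = []
    for seq_logits, seq_labels in zip(logits, label_ids):
        seq = []
        for token_logits, gold in zip(seq_logits, seq_labels):
            if gold == IGNORE_INDEX:
                continue
            # Index of the highest-scoring class for this token.
            best = max(range(len(token_logits)), key=token_logits.__getitem__)
            seq.append(label_list[best])
        decoded.append(seq)
    return decoded

# Made-up label set and logits for one sequence of four tokens.
label_list = ["O", "B-QUESTION", "I-QUESTION", "B-ANSWER", "I-ANSWER"]
logits = [[
    [0.1, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 3.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 0.0, 0.0],  # ignored position (gold label is -100)
    [0.0, 0.0, 0.0, 1.5, 0.2],
]]
label_ids = [[1, 2, -100, 3]]

tags = decode_predictions(logits, label_ids, label_list)
print(tags)  # [['B-QUESTION', 'I-QUESTION', 'B-ANSWER']]
```

From here you could write the tags to disk, group them into entities, or feed them to whatever downstream step you need.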
Hi,
See also my demo notebooks here: https://github.com/NielsRogge/Transformers-Tutorials/tree/master/LiLT
Hello,
Could you explain how to organize a custom dataset into the FUNSD/XFUND format?
Do you recommend any tutorial for this step?
Thank you in advance.