awesome-align
Some documentation?
The documentation is a bit scarce.
- What model (model size) does the package use?
- If I want to fine-tune the model, how should the training data be formatted/named/organized?
Hi,
- We use bert-base-multilingual-cased (see https://github.com/neulab/awesome-align?tab=readme-ov-file#fine-tuning-on-parallel-data). You can also use xlm-roberta-base on the xlmr branch (https://github.com/neulab/awesome-align/tree/xlmr).
- For the training data format, refer to the examples folder and the input format section of the README (https://github.com/neulab/awesome-align?tab=readme-ov-file#input-format).
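For anyone else landing here, a minimal sketch of preparing a data file, assuming the format described in the linked README section: one sentence pair per line, both sides already tokenized, source and target separated by ` ||| `. The helper names and the output file name `parallel.src-tgt` are hypothetical and not part of awesome-align, so please check the README and the examples folder before relying on this.

```python
from pathlib import Path

# Separator between source and target sentence, as assumed from the
# README's input format section (space, three pipes, space).
SEPARATOR = " ||| "


def format_parallel_line(src_tokens, tgt_tokens):
    """Join two token lists into one 'source ||| target' line."""
    if not src_tokens or not tgt_tokens:
        raise ValueError("both sides of a sentence pair must be non-empty")
    return " ".join(src_tokens) + SEPARATOR + " ".join(tgt_tokens)


def write_parallel_file(pairs, path):
    """Write (src_tokens, tgt_tokens) pairs to `path`, one pair per line."""
    lines = [format_parallel_line(src, tgt) for src, tgt in pairs]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


if __name__ == "__main__":
    pairs = [
        (["Das", "ist", "gut", "."], ["This", "is", "good", "."]),
        (["Danke", "!"], ["Thank", "you", "!"]),
    ]
    # Hypothetical file name; any path can be used.
    for line in write_parallel_file(pairs, "parallel.src-tgt"):
        print(line)
```

The resulting file is what you would point the data file argument at when following the fine-tuning instructions in the README.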