awesome-align Some documentation?

Open • thistlillo opened this issue 8 months ago • 1 comment

The documentation is a bit scarce.

What model (model size) does the package use?
If I want to fine-tune the model, how should the training data be formatted/named/organized?

Mar 13 '25 05:03 thistlillo

Hi,

We use bert-base-multilingual-cased (https://github.com/neulab/awesome-align?tab=readme-ov-file#fine-tuning-on-parallel-data). You can also use xlm-roberta-base in this branch (https://github.com/neulab/awesome-align/tree/xlmr).
For the training data format, you can refer to the examples folder (https://github.com/neulab/awesome-align?tab=readme-ov-file#input-format).
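
To make the data format concrete, here is a minimal sketch, assuming the layout described in the README's input-format section: one sentence pair per line, both sides already tokenized, with source and target separated by ` ||| `. The helper names `format_pair` and `write_parallel_file` are hypothetical and not part of awesome-align, so please check the result against the files in the examples folder.

```python
from typing import Iterable, List, Tuple

# Separator between source and target, as assumed from the README's input-format section.
SEPARATOR = " ||| "


def format_pair(src_tokens: List[str], tgt_tokens: List[str]) -> str:
    """Join one tokenized sentence pair into a single line (hypothetical helper)."""
    if not src_tokens or not tgt_tokens:
        raise ValueError("both sides need at least one token")
    # A token containing the separator would make the line ambiguous.
    for token in src_tokens + tgt_tokens:
        if "|||" in token:
            raise ValueError("tokens must not contain '|||'")
    return " ".join(src_tokens) + SEPARATOR + " ".join(tgt_tokens)


def write_parallel_file(
    pairs: Iterable[Tuple[List[str], List[str]]], path: str
) -> int:
    """Write one pair per line to `path` and return the number of lines written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for src_tokens, tgt_tokens in pairs:
            handle.write(format_pair(src_tokens, tgt_tokens) + "\n")
            count += 1
    return count
```

For example, `format_pair(["Das", "ist", "ein", "Test", "."], ["This", "is", "a", "test", "."])` gives a line with the German tokens, the separator, then the English tokens. Tokenization itself is left to whatever tokenizer suits your language pair.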

Mar 13 '25 22:03 zdou0830