Mask-Align

How to generate alignment based on token level?

Open michelleqyhqyh opened this issue 2 years ago • 2 comments

I can use your command to generate alignments at the BPE level. How can I generate alignments at the token level?

michelleqyhqyh avatar Sep 27 '22 03:09 michelleqyhqyh

By default, the generated alignment is already word-level, as indicated by https://github.com/THUNLP-MT/Mask-Align/blob/main/thualign/utils/alignment.py#L168 with the remove_bpe parameter of the weights_to_align method.

carboncoo avatar Sep 28 '22 06:09 carboncoo

This is hard-coded because we normally only care about word-level alignment. You can change the default value of remove_bpe yourself, in the weights_to_align method in thualign/utils/alignment.py.
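For illustration, here is a minimal sketch of what collapsing a BPE-level alignment to word level can look like. It is not the code from thualign/utils/alignment.py. It assumes subword-nmt style BPE where a trailing "@@" marks a piece that continues into the next one, and zero-based (source, target) index pairs. The function names are hypothetical.

```python
from typing import List, Set, Tuple

# Assumption: subword-nmt style BPE, where "@@" at the end of a piece
# means the word continues in the next piece.
BPE_SUFFIX = "@@"


def subword_to_word_index(subwords: List[str]) -> List[int]:
    """Map each subword position to the index of the word it belongs to."""
    mapping = []
    word_idx = 0
    for piece in subwords:
        mapping.append(word_idx)
        # A piece without the continuation marker ends the current word.
        if not piece.endswith(BPE_SUFFIX):
            word_idx += 1
    return mapping


def merge_bpe_alignment(
    src_subwords: List[str],
    tgt_subwords: List[str],
    links: Set[Tuple[int, int]],
) -> Set[Tuple[int, int]]:
    """Project subword-level links onto word indices, dropping duplicates."""
    src_map = subword_to_word_index(src_subwords)
    tgt_map = subword_to_word_index(tgt_subwords)
    return {(src_map[i], tgt_map[j]) for i, j in links}


src = ["al@@", "ign@@", "ment", "works"]
tgt = ["Aus@@", "richtung", "funktioniert"]
bpe_links = {(0, 0), (1, 1), (2, 1), (3, 2)}

word_links = merge_bpe_alignment(src, tgt, bpe_links)
print(sorted(word_links))  # [(0, 0), (1, 1)]
```

Skipping the merge step and keeping bpe_links as-is corresponds to the BPE-level output; whether remove_bpe in Mask-Align behaves exactly this way should be checked against the linked source.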

On 2022-09-28, @.*** wrote:

By default, the generated alignment is already word-level, as indicated by https://github.com/THUNLP-MT/Mask-Align/blob/main/thualign/utils/alignment.py#L168 with the remove_bpe parameter of the weights_to_align method.

Where can I set the remove_bpe parameter?


carboncoo avatar Sep 28 '22 23:09 carboncoo