Zi-Yi Dou

Results 58 comments of Zi-Yi Dou

Hi, thanks for the interest! In the demo, we only print the aligned word pairs, thus most words in the second English sentence are not showing up because our model...

> I have a similar question: what if a src token is not aligned to any target token (or tgt token not aligned to any src token)? If so, how...

Thanks for the contribution! 1. I actually considered that case in https://github.com/neulab/awesome-align/blob/master/run_train.py#L170-L189, basically I cut the sentence lengths to max_len/2 when combining them, so that might not be necessary? 2....
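As a rough illustration of point 1 above, here is a minimal sketch of capping each side of a sentence pair at `max_len / 2` before combining them. It is not the code from `run_train.py`; the function name `truncate_pair` and the plain-list token IDs are assumptions made for the example, and special tokens are ignored.

```python
from typing import List, Tuple


def truncate_pair(src_ids: List[int], tgt_ids: List[int], max_len: int) -> Tuple[List[int], List[int]]:
    """Cap each side at max_len // 2 so the combined pair never exceeds max_len.

    Hypothetical helper for illustration only; not the repository's implementation.
    """
    half = max_len // 2
    return src_ids[:half], tgt_ids[:half]


# Example: a long source and a short target with max_len = 8.
src, tgt = truncate_pair(list(range(10)), list(range(3)), max_len=8)
combined = src + tgt  # at most 8 token IDs in total
```

With this kind of per-side cap, the combined sequence stays within `max_len` regardless of how long either input is, which is why a separate length check afterwards may be unnecessary.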

Hi, what are your training command and the size of your training data? It is possible that your data is too large, in which case you can just subsample a...
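One way to subsample a large training set is sketched below, assuming the data is held as a list of sentence-pair lines. The helper name `subsample_lines`, the fixed seed, and the sample size are assumptions for the example, not settings recommended in the comment.

```python
import random
from typing import List


def subsample_lines(lines: List[str], k: int, seed: int = 0) -> List[str]:
    """Return k lines chosen uniformly at random, keeping their original order.

    Hypothetical helper for illustration; a fixed seed makes the sample reproducible.
    """
    if k >= len(lines):
        return list(lines)
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(lines)), k))
    return [lines[i] for i in keep]


# Example with placeholder sentence pairs.
corpus = [f"source sentence {i} ||| target sentence {i}" for i in range(1000)]
subset = subsample_lines(corpus, k=100)
```

Because each line holds a complete sentence pair, sampling whole lines keeps source and target sentences together.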

Thanks! I guess I can start working on this after #26 is fixed?

Hi, I didn't try this before but I think our method can be directly applied to the encoder of a seq2seq model. Also, for MT models, we can use both...

Hi, if you are using an MT model like nllb, given a sentence pair (x, y), you can obtain contextualized word embeddings for x and y by: 1. feeding to...

Hi, right now the repo only supports mBERT and [XLM-R](https://github.com/neulab/awesome-align/tree/xlmr). You can check this [commit](https://github.com/neulab/awesome-align/commit/337f684a4ebebb07c1b50ddc4dcc1e73442753ce) to see how to incorporate a new model.

Thanks! Right now we don't have plans for releasing the merged attention model weights, but we may release the code in the future.

Hi, the config settings should be the same. Here is the file that I used for distributed training: https://github.com/zdou0830/METER/blob/main/azure_distributed_run.py. Not sure if it helps.