Yi-Chen (Howard) Lo
Hi guys, I implemented more features based on this tutorial (e.g. batched computation for attention) and added some notes. Check out my repo here: https://github.com/howardyclo/pytorch-seq2seq-example/blob/master/seq2seq.ipynb
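As a rough idea of what "batched computation for attention" looks like, here is a minimal sketch of Luong-style dot attention scored for a whole batch at once with `torch.bmm`; it is not necessarily exactly what the notebook does, and the names `decoder_state`, `encoder_outputs`, and `src_mask` are just illustrative:

```python
import torch

def batched_dot_attention(decoder_state, encoder_outputs, src_mask=None):
    """decoder_state: (batch, hidden); encoder_outputs: (batch, src_len, hidden)."""
    # Score every source position for every batch element in a single bmm:
    # (batch, 1, hidden) x (batch, hidden, src_len) -> (batch, 1, src_len)
    scores = torch.bmm(decoder_state.unsqueeze(1), encoder_outputs.transpose(1, 2))
    if src_mask is not None:
        # src_mask: (batch, src_len), True for real tokens; padding positions get -inf
        scores = scores.masked_fill(~src_mask.unsqueeze(1), float('-inf'))
    weights = torch.softmax(scores, dim=-1)        # (batch, 1, src_len)
    context = torch.bmm(weights, encoder_outputs)  # (batch, 1, hidden)
    return context.squeeze(1), weights.squeeze(1)
```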
Hello @jsenellart. Thanks for the reply. I think there is a misunderstanding. According to the documentation:

> On the target side, these features will be predicted by the network. The...
@jsenellart Hello, I expect the word features not to be affected by the tokenizer. And yes, I think we can just duplicate them by concatenating them back onto the BPE-tokenized subwords.
What I want to achieve is what the paper ["Linguistic Input Features Improve Neural Machine Translation"](https://goo.gl/nYbDMM) did:
Like the example above, my output would look like this (listed one token per line instead of joined by spaces, for readability):

- **Le￭** | Leonidas | B | NNP...
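For concreteness, a tiny sketch of this fan-out for a single word, using the paper's B/I/E/O subword-position tags (`fan_out_features` is just an illustrative name):

```python
def fan_out_features(word, subwords, feats):
    """Duplicate one word's features onto each of its BPE subwords and
    add a subword-position tag (B/I/E, or O for an unsplit word)."""
    rows = []
    for i, sub in enumerate(subwords):
        if len(subwords) == 1:
            pos = 'O'
        elif i == 0:
            pos = 'B'
        elif i == len(subwords) - 1:
            pos = 'E'
        else:
            pos = 'I'
        rows.append([sub, word, pos] + feats)
    return rows

for row in fan_out_features('Leonidas', ['Le￭', 'onidas'], ['NNP']):
    print(' | '.join(row))
# Le￭ | Leonidas | B | NNP
# onidas | Leonidas | E | NNP
```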
I may have figured out a feasible but slightly tricky way to achieve what I want:

1. Learn BPE codes from the word-level corpus **with no word features** using `tools/learn_bpe.lua`.
2. ...
@jsenellart Thanks, that will be great!
I wrote a feasible example Python script that addresses my requirement:

```python
def align_features_for_subword_raw(raw_no_feat_subword, raw_feat,
                                    feature_delimiter='│', subword_delimiter='￭',
                                    keep_word=False, add_subword_indicator_feature=False):
    feature_start_pos = 0 if keep_word else 1
    ...
```
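A minimal standalone sketch of the realignment idea (not the full script above), assuming joiner-annotated BPE output where '￭' marks subwords that attach to their neighbour; `realign_features` is just an illustrative name:

```python
def realign_features(bpe_line, feat_line, feature_delimiter='│', joiner='￭'):
    """Copy each word's features onto every BPE subword of that word."""
    subwords = bpe_line.split()
    # Word-level tokens look like "word│feat1│feat2"; keep only the features.
    word_feats = [tok.split(feature_delimiter)[1:] for tok in feat_line.split()]
    out, w = [], 0
    for i, sub in enumerate(subwords):
        out.append(feature_delimiter.join([sub] + word_feats[w]))
        # Stay on the same word while a joiner links this subword to the next one.
        joined = sub.endswith(joiner) or (
            i + 1 < len(subwords) and subwords[i + 1].startswith(joiner))
        if not joined:
            w += 1
    return ' '.join(out)

bpe  = 'Le￭ onidas begged in the arena .'
feat = 'Leonidas│NNP begged│VBD in│IN the│DT arena│NN .│.'
print(realign_features(bpe, feat))
# Le￭│NNP onidas│NNP begged│VBD in│IN the│DT arena│NN .│.
```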
@chenyuntc Hello, I would also love to know this.