Reproduce results on SST-2 / SST-5
Hi,
could you please provide an instruction on how to reproduce your results on SST.
It's impossible for me to navigate your impressive codebase (it's huge). For example, there are 8 different attention implementations:
1. default_multihead_attention.py
2. dptree_individual_multihead_attention.py
3. dptree_multihead_attention.py
4. dptree_onseq_multihead_attention.py
5. dptree_sep_multihead_attention.py
6. nstack_merge_tree_attention.py
7. nstack_tree_attention.py
8. nstack_tree_attention_eff.py
Thanks!
Did you solve this problem? This codebase is too big for me to read. Thank you.
Hi, very sorry we did not have time to clean up the code. As shown in the instructions, please follow the configuration dwnstack_merge2seq_node_iwslt_onvalue_base_upmean_mean_mlesubenc_allcross_hier to find its implementation in the files nstack_archs.py and nstack_transformer.py.
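For reference, here is a minimal sketch of how that configuration name can be traced to its implementation, assuming the repo keeps fairseq's architecture-registry conventions; the registry names below are standard fairseq, not something specific to this repo, and the repo's model modules must have been imported first so their registrations run.

```python
# Sketch only, not from the repo's instructions: assumes the bundled
# fairseq package is importable and that the architecture name is
# registered via @register_model_architecture in nstack_archs.py /
# nstack_transformer.py.
from fairseq.models import ARCH_MODEL_REGISTRY, ARCH_CONFIG_REGISTRY

arch = "dwnstack_merge2seq_node_iwslt_onvalue_base_upmean_mean_mlesubenc_allcross_hier"

# Model class the architecture is attached to.
print(ARCH_MODEL_REGISTRY[arch])

# Function that fills in the architecture's default hyperparameters.
print(ARCH_CONFIG_REGISTRY[arch])
```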
Hi Nguyen, thank you for taking the time to reply to my email. I have two more questions:
1. There are letters representing dimensions in your code, such as tq, nq, etc. Does tq mean t*q? Or is there a document that explains these parameters consistently?
2. What's the difference between 'nstack2seq' and 'nstack_merge2seq'? I can see that their encoders and decoders differ, but the codebase is too large for me to figure it out.