transformer
A PyTorch implementation of "Attention Is All You Need" and "Weighted Transformer Network for Machine Translation"
How to use it?
Two discrepancies from the paper: 1. A dropout layer should sit between the two fully connected layers in the FFN. 2. In the embedding layers, the weights should be multiplied by sqrt(d_model). Both fixes are sketched below.
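A minimal sketch of both fixes in PyTorch (the class names PositionwiseFFN and ScaledEmbedding are illustrative, not taken from this repo):
'''
import math
import torch
import torch.nn as nn

class PositionwiseFFN(nn.Module):
    # Fix 1: dropout sits between the two fully connected layers.
    def __init__(self, d_model=512, d_ff=2048, dropout=0.1):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_ff)
        self.fc2 = nn.Linear(d_ff, d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        return self.fc2(self.dropout(torch.relu(self.fc1(x))))

class ScaledEmbedding(nn.Module):
    # Fix 2: embedding outputs are multiplied by sqrt(d_model).
    def __init__(self, vocab_size, d_model=512):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_model)
        self.d_model = d_model

    def forward(self, tokens):
        return self.emb(tokens) * math.sqrt(self.d_model)
'''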
Hi. I would like to add some features to this code that I have read about for beam search, like coverage penalty and length normalization, but I don't know where...
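For reference, both features come from the GNMT paper (Wu et al., 2016), where hypotheses are ranked by score(Y, X) = log P(Y|X) / lp(Y) + cp(X, Y). A minimal sketch of that rescoring, assuming you have each hypothesis's total log-probability and its accumulated attention matrix (the names below are hypothetical, not from this repo):
'''
import torch

def length_penalty(length, alpha=0.6):
    # GNMT length normalization: lp(Y) = ((5 + |Y|) / 6) ** alpha
    return ((5.0 + length) / 6.0) ** alpha

def coverage_penalty(attn, beta=0.2):
    # attn: (target_len, source_len) attention weights of one hypothesis.
    # GNMT coverage penalty: sum attention received by each source token
    # over all target steps, clamp at 1, and reward covered positions.
    coverage = attn.sum(dim=0).clamp(max=1.0)
    return beta * torch.log(coverage).sum()

def rescore(log_prob, length, attn, alpha=0.6, beta=0.2):
    # Final beam score: length-normalized log-prob plus coverage bonus.
    return log_prob / length_penalty(length, alpha) + coverage_penalty(attn, beta)
'''
Finished hypotheses would then be ranked by rescore() instead of raw log-probability; where exactly to hook this in depends on how this repo's beam search stores per-step attention.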
The test set is useless and there are lots of bugs...
The error "ValueError: only one element tensors can be converted to Python scalars" occurred in L79: input_pos = tensor([list(range(1, len+1)) + [0]*(max_len-len) for len in input_len]) in modules.py. I want...
'''
# train.py line 104
enc_inputs, enc_inputs_len = batch.src
dec_, dec_inputs_len = batch.trg
dec_inputs = dec_[:, :-1]
dec_targets = dec_[:, 1:]
dec_inputs_len = dec_inputs_len - 1
'''
In the original paper...
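For context, this snippet is the standard teacher-forcing shift: the decoder input drops the final token and the target drops the leading BOS, so position t is trained to predict token t+1. A toy illustration (token values hypothetical):
'''
import torch

BOS, EOS = 1, 2
dec_ = torch.tensor([[BOS, 11, 12, 13, EOS]])
dec_inputs = dec_[:, :-1]    # [[BOS, 11, 12, 13]] -- fed to the decoder
dec_targets = dec_[:, 1:]    # [[11, 12, 13, EOS]] -- labels per position
# The shifted input is one token shorter, hence dec_inputs_len - 1.
'''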
python3 train.py -model_path models -data_path models/preprocess-train.t7
Namespace(batch_size=128, d_ff=2048, d_k=64, d_model=512, d_v=64, data_path='models/preprocess-train.t7', display_freq=100, dropout=0.1, log=None, lr=0.0002, max_epochs=10, max_grad_norm=None, max_src_seq_len=50, max_tgt_seq_len=50, model_path='models', n_heads=8, n_layers=6, n_warmup_steps=4000, share_embs_weight=False, share_proj_weight=False, weighted_model=False)
Loading training and...