
Confused by this conv1d operation

airkid opened this issue 5 years ago · 1 comment

Hi, I'm studying this code and it has helped me a lot, but I'm confused by this line: https://github.com/ne7ermore/torch-light/blob/254c1333eef5ee35a1b5e036f267b81ddad17f96/BERT/model.py#L74

In the original BERT paper I haven't found any mention of BERT using a Conv1d layer in the Transformer instead of a linear transformation.

And in http://nlp.seas.harvard.edu/2018/04/03/attention.html#position-wise-feed-forward-networks, the position-wise feed-forward network is implemented as an MLP with linear layers.

Can anyone kindly help me with this problem?

airkid · May 18 '19 07:05

It is the same: a Conv1d with kernel_size=1 applied along the sequence dimension is exactly a position-wise linear transformation.
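
To spell that out: `nn.Conv1d` with `kernel_size=1` computes the same per-position affine map as `nn.Linear`, just with the channel dimension in a different place. A minimal sketch of the equivalence (the layer sizes below are arbitrary, chosen only for illustration):

```python
import torch
import torch.nn as nn

batch, seq_len, d_model, d_ff = 2, 5, 8, 16  # illustrative sizes only

linear = nn.Linear(d_model, d_ff)
conv = nn.Conv1d(d_model, d_ff, kernel_size=1)

# Copy the Linear weights into the Conv1d so the outputs can be compared directly.
# Linear weight: (d_ff, d_model); Conv1d weight: (d_ff, d_model, 1).
with torch.no_grad():
    conv.weight.copy_(linear.weight.unsqueeze(-1))
    conv.bias.copy_(linear.bias)

x = torch.randn(batch, seq_len, d_model)

out_linear = linear(x)                              # (batch, seq_len, d_ff)
out_conv = conv(x.transpose(1, 2)).transpose(1, 2)  # Conv1d expects (batch, channels, seq_len)

print(torch.allclose(out_linear, out_conv, atol=1e-6))  # True
```

So the Conv1d-based feed-forward block and the Linear-based one are interchangeable. The original "Attention Is All You Need" paper makes the same point, noting that the position-wise feed-forward network can also be described as two convolutions with kernel size 1.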


ne7ermore · May 19 '19 02:05