
Siamese recurrent neural network architectures for semantic textual similarity

Siamese-Recurrent-Architectures

Siamese networks contain two or more identical sub-networks that share weights. They tend to perform well on similarity tasks and have been used for sentence semantic similarity, forged-signature recognition, and many more. This paper offers a straightforward siamese approach to the common problem of sentence similarity, named MaLSTM (“Ma” for Manhattan distance).
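
At its core, MaLSTM scores a sentence pair by exponentiating the negative Manhattan (L1) distance between the final hidden states produced by the two identical sub-networks. A minimal NumPy sketch of that similarity function (the array names are illustrative):

```python
import numpy as np

def manhattan_similarity(h_left, h_right):
    """MaLSTM similarity: exp(-||h_left - h_right||_1).

    h_left and h_right are the final hidden states of the two identical
    sub-networks. The result lies in (0, 1], with 1 meaning the two
    sentence encodings are identical.
    """
    return np.exp(-np.sum(np.abs(h_left - h_right), axis=-1))
```

Because the two branches share weights, training this score against human similarity labels pushes similar sentences toward nearby points in the hidden space.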

This project digs deeper into siamese networks in an attempt to outperform MaLSTM. The following architectures have been considered.


MALSTM (Original Paper)

Architecture

(architecture diagram)

The experiments can be found here. The best result was:

| Optimizer | Transfer Learning | Augmentation | Embedding | RMSE | Pearson | Spearman |
|---|---|---|---|---|---|---|
| Adagrad | False | False | word2vec | 0.153 | 0.809 | 0.741 |
| Adadelta\* | True | True | word2vec | 0.156 | 0.802 | 0.733 |

\* denotes the configuration from the original paper.
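
For concreteness, here is a minimal sketch of the MaLSTM setup, assuming a Keras/TensorFlow implementation; the vocabulary size, layer sizes, and names are illustrative defaults, not the repository's exact code:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_malstm(vocab_size=20000, embed_dim=300, hidden=50, max_len=30):
    left = layers.Input(shape=(max_len,), name="left")
    right = layers.Input(shape=(max_len,), name="right")

    # Siamese: both branches reuse the same embedding and LSTM weights.
    embed = layers.Embedding(vocab_size, embed_dim)  # initialized from word2vec in practice
    encoder = layers.LSTM(hidden)

    h_left = encoder(embed(left))
    h_right = encoder(embed(right))

    # Manhattan similarity: exp(-L1 distance), bounded in (0, 1].
    sim = layers.Lambda(
        lambda t: tf.exp(-tf.reduce_sum(tf.abs(t[0] - t[1]), axis=-1, keepdims=True))
    )([h_left, h_right])

    model = Model(inputs=[left, right], outputs=sim)
    model.compile(optimizer="adadelta", loss="mse")
    return model
```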


MAGRU

Architecture

(architecture diagram)

The experiments can be found here. The best result was:

| Optimizer | Transfer Learning | Augmentation | Embedding | RMSE | Pearson | Spearman |
|---|---|---|---|---|---|---|
| Adadelta | True | True | word2vec | 0.140 | 0.838 | 0.780 |
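
MAGRU keeps the siamese structure and the Manhattan similarity and only swaps the recurrent cell; in the hypothetical sketch above, this is a one-line change:

```python
from tensorflow.keras import layers

# Drop-in replacement for the LSTM encoder in the MaLSTM sketch above
encoder = layers.GRU(50)
```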

MABILSTM

Architecture

(architecture diagram)

The experiments can be found here. The best result was:

| Optimizer | Transfer Learning | Augmentation | Embedding | RMSE | Pearson | Spearman |
|---|---|---|---|---|---|---|
| Adadelta | True | False | word2vec | 0.164 | 0.784 | 0.708 |
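
MABILSTM reads each sentence in both directions and concatenates the two final states, doubling the encoding size. In the same hypothetical sketch:

```python
from tensorflow.keras import layers

# Bidirectional LSTM encoder; the output dimension doubles to 2 * hidden
encoder = layers.Bidirectional(layers.LSTM(50), merge_mode="concat")
```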

MABIGRU

Architecture

(architecture diagram)

The experiments can be found here. The best result was:

| Optimizer | Transfer Learning | Augmentation | Embedding | RMSE | Pearson | Spearman |
|---|---|---|---|---|---|---|
| Adadelta | True | True | word2vec | 0.143 | 0.832 | 0.773 |
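
MABIGRU combines the two previous changes, wrapping a GRU cell in the same bidirectional encoder:

```python
from tensorflow.keras import layers

# Bidirectional GRU encoder, again concatenating the two directions
encoder = layers.Bidirectional(layers.GRU(50), merge_mode="concat")
```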

MALSTM-ATTENTION

Architecture

(architecture diagram)

The experiments can be found here. The best result was:

| Optimizer | Transfer Learning | Augmentation | Embedding | RMSE | Pearson | Spearman |
|---|---|---|---|---|---|---|
| Adagrad | False | False | word2vec | 0.145 | 0.827 | 0.765 |
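
Instead of keeping only the final hidden state, the attention variants retain the full output sequence and pool it with learned attention weights before computing the Manhattan similarity. A minimal additive self-attention pooling layer, again as an illustrative Keras sketch rather than the repository's exact code:

```python
import tensorflow as tf
from tensorflow.keras import layers

class AttentionPool(layers.Layer):
    """Pools an RNN output sequence into a single vector with learned attention."""

    def build(self, input_shape):
        d = int(input_shape[-1])
        self.w = self.add_weight(name="w", shape=(d, 1), initializer="glorot_uniform")

    def call(self, seq):                                        # seq: (batch, time, d)
        scores = tf.nn.softmax(tf.squeeze(tf.matmul(seq, self.w), -1), axis=-1)
        return tf.reduce_sum(seq * scores[..., None], axis=1)   # -> (batch, d)

# Per branch: attention-pool the full LSTM output sequence, then apply
# the same Manhattan similarity as before.
encoder = layers.LSTM(50, return_sequences=True)
pool = AttentionPool()
```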

MAGRU-ATTENTION

Architecture

(architecture diagram)

The experiments can be found here. The best result was:

| Optimizer | Transfer Learning | Augmentation | Embedding | RMSE | Pearson | Spearman |
|---|---|---|---|---|---|---|
| Adadelta | False | False | word2vec | 0.149 | 0.818 | 0.751 |
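
MAGRU-ATTENTION applies the same attention pooling over GRU outputs instead:

```python
from tensorflow.keras import layers

# Same AttentionPool as above, applied to a GRU output sequence
encoder = layers.GRU(50, return_sequences=True)
```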

MAGRU-CAPSULE

Architecture

(architecture diagram)

The experiments can be found here. The best result was:

| Optimizer | Transfer Learning | Augmentation | Embedding | RMSE | Pearson | Spearman |
|---|---|---|---|---|---|---|
| Adadelta | False | False | word2vec | 0.156 | 0.806 | 0.733 |
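
The capsule variant replaces the pooling step with a capsule layer on top of the GRU outputs. The defining operation of a capsule is the squash nonlinearity, which rescales a vector's length into [0, 1) while preserving its direction; a minimal sketch (a full capsule layer would add routing on top of this, and the repository's exact implementation may differ):

```python
import tensorflow as tf

def squash(v, axis=-1, eps=1e-7):
    """Capsule squash: short vectors shrink toward 0, long ones approach length 1."""
    sq_norm = tf.reduce_sum(tf.square(v), axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * (v / tf.sqrt(sq_norm + eps))
```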