tf-nlp-blocks
Author: Han Xiao https://hanxiao.github.io
A collection of frequently-used deep learning blocks I have implemented in Tensorflow. It covers the core tasks in NLP such as embedding, encoding, matching and pooling. All implementations follow a modularized design pattern which I call the "block-design". More details can be found in my blog post.
- Requirements
- Contents
  - encode_blocks.py
  - match_blocks.py
  - pool_blocks.py
  - embed_blocks.py
  - mulitask_blocks.py
  - nn.py
- Run
Requirements
- Python >= 3.6
- Tensorflow >= 1.6
Contents
encode_blocks.py
A collection of sequence encoding blocks. The input is a sequence of shape `[B, L, D]`; the output is another sequence of shape `[B, L, D']`, where `B` is the batch size, `L` is the length of the sequence, and `D` and `D'` are the dimensions.
Name | Dependencies | Description | Reference |
---|---|---|---|
`LSTM_encode` | | a fast multi-layer bidirectional LSTM implementation based on `CudnnLSTM`; expect it to be 5~10x faster than the standard tf `LSTMCell`. However, it can only run on GPU. | Tensorflow doc on CudnnLSTM |
`TCN_encode` | `Res_DualCNN_encode` | a temporal convolution network described in the paper: basically a multi-layer dilated CNN with special padding to ensure causality | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
`Res_DualCNN_encode` | `CNN_encode` | a sub-block used by `TCN_encode`: a two-layer CNN with spatial dropout in between, followed by a residual connection and a layer-norm | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
`CNN_encode` | | a standard conv1d implementation along the `L` axis, with the possibility to set different paddings | Convolutional Neural Networks for Sentence Classification |
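As a rough illustration of the `[B, L, D]` to `[B, L, D']` contract these blocks share, here is a minimal sketch. The name `toy_cnn_encode` and its arguments are hypothetical stand-ins, not the actual signatures in `encode_blocks.py`.

```python
import tensorflow as tf

def toy_cnn_encode(seqs, num_filters=128, kernel_size=3):
    """Hypothetical encode block: [B, L, D] -> [B, L, D']."""
    # A single conv1d over the time axis L; here D' equals num_filters.
    return tf.layers.conv1d(seqs, filters=num_filters, kernel_size=kernel_size,
                            padding='same', activation=tf.nn.relu)

x = tf.placeholder(tf.float32, [None, 20, 300])  # [B, L, D]
y = toy_cnn_encode(x)                            # [B, L, 128]
```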
match_blocks.py
A collection of sequence matching blocks, a.k.a. attention. The input is two sequences: `context` of shape `[B, L_c, D]` and `query` of shape `[B, L_q, D]`. The output is a sequence with the same length as `context`, i.e. of shape `[B, L_c, D]`. Each position in the output encodes the relevance of that position in `context` to the complete `query`.
Name | Dependencies | Description | Reference |
---|---|---|---|
`Attentive_match` | | basic attention mechanism with different scoring functions; also supports future blinding | `additive`: Neural machine translation by jointly learning to align and translate; `scaled`: Attention is all you need |
`Transformer_match` | | a multi-head attention block from "Attention is all you need" | Attention is all you need |
`AttentiveCNN_match` | `Attentive_match` | the light version of attentive convolution, with the possibility of future blinding to ensure causality | Attentive Convolution |
`BiDaf_match` | | the attention flow layer used in the BiDAF model | Bidirectional Attention Flow for Machine Comprehension |
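A minimal sketch of this matching contract, using plain scaled dot-product attention; `toy_scaled_match` is a hypothetical illustration, not the `Attentive_match` API from this repo.

```python
import tensorflow as tf

def toy_scaled_match(context, query):
    """Hypothetical matching block: ([B, L_c, D], [B, L_q, D]) -> [B, L_c, D]."""
    d = tf.cast(tf.shape(query)[-1], tf.float32)
    # Score every context position against every query position.
    scores = tf.matmul(context, query, transpose_b=True) / tf.sqrt(d)  # [B, L_c, L_q]
    weights = tf.nn.softmax(scores)        # softmax over the query axis L_q
    return tf.matmul(weights, query)       # each context position summarizes the query
```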
pool_blocks.py
A collection of pooling blocks. They fuse/reduce along the time axis `L`. The input is a sequence of shape `[B, L, D]`; the output has shape `[B, D]`.
Name | Dependencies | Description | Reference |
---|---|---|---|
`SWEM_pool` | | pooling on the input sequence; supports max/avg. pooling and hierarchical avg.-max pooling | Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms |
There are also some convolution-based pooling blocks built on `SWEM_pool`, but they are for experimental purposes, so I will not list them here.
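The `[B, L, D]` to `[B, D]` pooling contract can be sketched as below. This is a hypothetical example (including the `toy_swem_pool` name and its length-masking behavior), not the repo's actual `SWEM_pool` implementation.

```python
import tensorflow as tf

def toy_swem_pool(seqs, seq_lens, method='max'):
    """Hypothetical pooling block: [B, L, D] -> [B, D], respecting true lengths."""
    mask = tf.sequence_mask(seq_lens, maxlen=tf.shape(seqs)[1], dtype=tf.float32)
    mask = tf.expand_dims(mask, axis=-1)                       # [B, L, 1]
    if method == 'avg':
        # Mean over the valid (non-padded) positions only.
        return tf.reduce_sum(seqs * mask, axis=1) / tf.maximum(
            tf.reduce_sum(mask, axis=1), 1e-10)
    # Max pooling: push padded positions to a very negative value first.
    return tf.reduce_max(seqs + (mask - 1.0) * 1e30, axis=1)
```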
embed_blocks.py
A collection of positional encoding blocks for sequences.
Name | Dependencies | Description | Reference |
---|---|---|---|
`SinusPositional_embed` | | generates a sinusoid signal with the same length as the input sequence | Attention is all you need |
`Positional_embed` | | parameterizes the absolute positions of the tokens in the input sequence | A Convolutional Encoder Model for Neural Machine Translation |
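For reference, a minimal sketch of a sinusoid positional signal in the style of "Attention is all you need"; `toy_sinus_embed` is a hypothetical illustration, not the `SinusPositional_embed` API.

```python
import numpy as np
import tensorflow as tf

def toy_sinus_embed(seq_len, dim):
    """Hypothetical sinusoid signal [1, seq_len, dim], broadcast-added to [B, seq_len, dim]."""
    pos = np.arange(seq_len)[:, None].astype(np.float32)             # [L, 1]
    i = np.arange(dim)[None, :].astype(np.float32)                   # [1, D]
    angle = pos / np.power(10000.0, (2.0 * np.floor(i / 2.0)) / dim)
    angle[:, 0::2] = np.sin(angle[:, 0::2])   # even dimensions: sine
    angle[:, 1::2] = np.cos(angle[:, 1::2])   # odd dimensions: cosine
    return tf.constant(angle[None, ...], dtype=tf.float32)
```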
mulitask_blocks.py
A collection of multi-task learning blocks. So far only the "cross-stitch block" is available.
Name | Dependencies | Description | Reference |
---|---|---|---|
`CrossStitch` | | a cross-stitch block, modeling the correlation & self-correlation of two tasks | Cross-stitch Networks for Multi-task Learning |
`Stack_CrossStitch` | `CrossStitch` | stacks multiple cross-stitch blocks together with shared/separated input | Cross-stitch Networks for Multi-task Learning |
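The idea behind a cross-stitch unit can be sketched as follows: the activations of two tasks are mixed through a small learnable matrix. This is a hypothetical, simplified illustration (scalar 2x2 mixing), not the repo's `CrossStitch` implementation.

```python
import tensorflow as tf

def toy_cross_stitch(x_a, x_b):
    """Hypothetical cross-stitch unit: mixes two tasks' activations via a learnable 2x2 matrix."""
    alpha = tf.get_variable(
        'alpha', shape=[2, 2],
        initializer=tf.constant_initializer([[0.9, 0.1], [0.1, 0.9]]))
    # Each output is a weighted combination of both task activations
    # (diagonal = self-correlation, off-diagonal = cross-task correlation).
    out_a = alpha[0, 0] * x_a + alpha[0, 1] * x_b
    out_b = alpha[1, 0] * x_a + alpha[1, 1] * x_b
    return out_a, out_b
```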
nn.py
A collection of auxiliary functions, e.g. masking, normalizing, slicing.
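As an example of the kind of helper this covers, here is a minimal masking sketch; `toy_mask` is a hypothetical name, not a function from `nn.py`.

```python
import tensorflow as tf

def toy_mask(seqs, seq_lens):
    """Hypothetical masking helper: zero out padded positions of a [B, L, D] batch."""
    mask = tf.sequence_mask(seq_lens, maxlen=tf.shape(seqs)[1], dtype=tf.float32)
    return seqs * tf.expand_dims(mask, -1)
```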
Run
Run `app.py` for a simple test on toy data.