Neural Arithmetic Logic Units
Metadata
- Authors: Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom
- Organization: DeepMind
- Conference: NIPS 2018
- Paper: https://arxiv.org/pdf/1808.00508.pdf
- Code: https://github.com/iamtrask/NALU-2
TL;DR
Presents a simple module capable of learning arithmetic functions (addition, subtraction, multiplication, division, etc.) that generalizes well to numbers outside the ranges seen during training.
DNNs with Non-linearities Struggle to Learn Identity Function
- Train an autoencoder to reconstruct its input, with inputs drawn from [-5, 5].
- All autoencoders are identical in their parameterization (3 hidden layers of size 8), differing only in the non-linearity used.
- Trained with MSE loss.
- Tested on [-20, 20], the error increases severely both below and above the range of numbers seen during training (a minimal sketch of this setup follows below).
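A minimal PyTorch sketch of this experiment, assuming scalar inputs; the notes only fix the architecture (3 hidden layers of size 8), the input range, and the MSE loss, so the batch size, optimizer, and step count here are assumptions:

```python
import torch
import torch.nn as nn

# Identity "autoencoder": 3 hidden layers of size 8, varying only the non-linearity.
def make_model(act=nn.ReLU):
    return nn.Sequential(
        nn.Linear(1, 8), act(),
        nn.Linear(8, 8), act(),
        nn.Linear(8, 8), act(),
        nn.Linear(8, 1),
    )

model = make_model(nn.Tanh)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train on inputs drawn uniformly from [-5, 5] with MSE loss.
for _ in range(2000):
    x = torch.empty(64, 1).uniform_(-5, 5)
    loss = nn.functional.mse_loss(model(x), x)
    opt.zero_grad(); loss.backward(); opt.step()

# Test on [-20, 20]: the absolute error grows sharply outside the training range.
x_test = torch.linspace(-20, 20, 41).unsqueeze(1)
print((model(x_test) - x_test).abs().squeeze())
```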
The Neural Accumulator (NAC) & Neural Arithmetic Logic Unit (NALU)

- NAC: A special case of a linear layer whose weight matrix W is encouraged to take values in {-1, 0, 1}, defined as:
- W = tanh(\hat{W}) ⊙ σ(\hat{M}), where ⊙ is the element-wise product
- The elements of W are guaranteed to lie in [-1, 1] and are biased towards {-1, 0, 1} during learning, since those values correspond to the saturation points of tanh(·) and σ(·).
- Its outputs are therefore additions and subtractions of elements of the input vector, with no scaling.
- NALU: Combines two sub-cells via a learned sigmoidal gate:
- One is the original NAC, capable of learning to add and subtract.
- The other operates in log space and can learn multiplication and division, since log(XY) = log X + log Y, log(X/Y) = log X - log Y, and exp(log X) = X.
- Altogether, NALU can learn to perform general arithmetic operations; a sketch of both cells follows below.
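A minimal PyTorch sketch of the two cells as described above (the initialization scale and the ε constant are assumptions; the linked implementation may differ in details):

```python
import torch
import torch.nn as nn

class NAC(nn.Module):
    """Additive cell: W = tanh(W_hat) * sigmoid(M_hat), output a = W x."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.W_hat = nn.Parameter(torch.randn(d_out, d_in) * 0.1)
        self.M_hat = nn.Parameter(torch.randn(d_out, d_in) * 0.1)

    def forward(self, x):
        W = torch.tanh(self.W_hat) * torch.sigmoid(self.M_hat)
        return x @ W.t()

class NALU(nn.Module):
    """Gated mix of the additive NAC and the same cell applied in log space."""
    def __init__(self, d_in, d_out, eps=1e-8):
        super().__init__()
        self.nac = NAC(d_in, d_out)
        self.G = nn.Parameter(torch.randn(d_out, d_in) * 0.1)
        self.eps = eps

    def forward(self, x):
        a = self.nac(x)                                    # add / subtract path
        m = torch.exp(self.nac(torch.log(torch.abs(x) + self.eps)))  # mult / div path
        g = torch.sigmoid(x @ self.G.t())                  # learned gate
        return g * a + (1 - g) * m
```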
Limitations of a single NALU [Ref]
- Can handle either add/subtract or mult/div operations, but not a combination of both.
- For mult/div operations, it cannot handle negative targets, since the mult/div path's output comes from an exponentiation, which always yields positive results (see the short restatement below).
- Power operations are only possible when the exponent lies in the range [0, 1].
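The sign limitation follows directly from the form of NALU's multiplicative path; restated in the paper's notation:

```latex
m \;=\; \exp\!\big(W \log(|x| + \epsilon)\big) \;>\; 0,
\qquad
W_{ij} = \tanh(\hat{W}_{ij})\,\sigma(\hat{M}_{ij}) \in [-1, 1].
```

Since exp(·) is strictly positive, no choice of weights can make this path produce a negative target, and the bounded entries of W restrict which exponents a single cell can represent.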
Related Work
- Analysing Mathematical Reasoning Abilities of Neural Models. David Saxton et al., DeepMind. ICLR 2019.