Representation Learning with Contrastive Predictive Coding
https://arxiv.org/abs/1807.03748 ~big fan of Aaron van den Oord~
Abstract
- proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data, called CPC (Contrastive Predictive Coding)
- uses a probabilistic contrastive loss which induces the latent space to capture information that is maximally useful for predicting future samples; the loss is made tractable by using negative sampling
1. Introduction
- unsupervised learning is an important stepping stone towards robust and generic representation learning
- the idea of predictive coding comes from neuroscience; related uses of prediction for representation learning include Word2Vec and image colorization
- the paper proposes to:
  - compress high-dimensional data into a compact latent embedding space in which conditional predictions are easier to model
  - predict multiple steps into the future in this latent space with a powerful autoregressive model
  - rely on a Noise-Contrastive Estimation (NCE) based loss, as used for learning word embeddings in natural language models
  - train the resulting CPC model end-to-end
- the CPC model outperforms other approaches in a variety of domains
2. Contrastive Predictive Coding
2.1. Motivation and Intuitions
- in time-series and high-dimensional modeling, approaches based on next-step prediction exploit the local smoothness of the signal
- when predicting further into the future, the amount of shared information becomes much lower and the model needs to infer more global structure; these slowly varying features are called 'slow features'
- high-level latent variables such as a class label carry far less information than the high-dimensional signal itself, so instead of modeling p(x|c) directly, the target x and context c are encoded so as to maximally preserve their mutual information I(x; c)
2.2. Contrastive Predictive Coding
- a non-linear encoder g_enc maps the input sequence of observations x_t to latent representations z_t = g_enc(x_t)
- an autoregressive model g_ar summarizes all z_<=t in the latent space and produces a context latent representation c_t = g_ar(z_<=t) (a code sketch of both components follows this list)
- instead of predicting future observations x_{t+k} directly with a generative model, CPC models a density ratio that preserves the mutual information between x_{t+k} and c_t, e.g. with a log-bilinear score f_k(x_{t+k}, c_t) = exp(z_{t+k}^T W_k c_t) ∝ p(x_{t+k} | c_t) / p(x_{t+k})
- in the proposed model, either z_t or c_t can be used as the representation for downstream tasks
- c_t is preferable when extra context from the past is useful (e.g. speech recognition)
- a single z_t might not contain enough information to capture phonetic content
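
A minimal sketch of g_enc and g_ar, assuming a PyTorch-style implementation on 1-D (audio-like) input; the layer sizes, strides, and module names below are illustrative, not the paper's exact strided-CNN/GRU configuration:

```python
import torch
import torch.nn as nn

class CPCEncoder(nn.Module):
    """g_enc: maps raw observations x to one latent z_t per (downsampled) timestep."""
    def __init__(self, in_channels=1, z_dim=256):
        super().__init__()
        # stack of strided 1-D convolutions; each output column is one z_t
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, z_dim, kernel_size=10, stride=5, padding=3), nn.ReLU(),
            nn.Conv1d(z_dim, z_dim, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.Conv1d(z_dim, z_dim, kernel_size=4, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):            # x: (batch, channels, time)
        z = self.conv(x)             # (batch, z_dim, T)
        return z.transpose(1, 2)     # (batch, T, z_dim)


class CPCContext(nn.Module):
    """g_ar: summarizes z_<=t into a context vector c_t = g_ar(z_<=t)."""
    def __init__(self, z_dim=256, c_dim=256):
        super().__init__()
        self.gru = nn.GRU(z_dim, c_dim, batch_first=True)

    def forward(self, z):            # z: (batch, T, z_dim)
        c, _ = self.gru(z)           # c[:, t] only depends on z_<=t (causal)
        return c                     # (batch, T, c_dim)
```

For images the paper swaps these parts out: a ResNet encoder over image patches plays the role of g_enc and a PixelCNN-style autoregressive model plays the role of g_ar.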
2.3. Noise Contrastive Estimation Loss
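- given a set X = {x_1, ..., x_N} with one positive sample drawn from p(x_{t+k} | c_t) and N-1 negatives drawn from the proposal distribution p(x_{t+k}), the model is trained to pick out the positive: L_N = -E[ log ( f_k(x_{t+k}, c_t) / sum_{x_j in X} f_k(x_j, c_t) ) ]
- this is a categorical cross-entropy over the N candidates; optimizing it maximizes a lower bound on the mutual information I(x_{t+k}; c_t) (the paper calls this loss InfoNCE)

A hedged sketch of one InfoNCE step, assuming in-batch negatives (every other sequence in the minibatch supplies a negative z_{t+k}) and one learned prediction matrix `W_k` per offset k; names and shapes are illustrative:

```python
import torch
import torch.nn.functional as F

def info_nce_step(z, c, W_k, t, k):
    """InfoNCE for a single offset k: classify the true z_{t+k} among in-batch negatives.

    z:   (batch, T, z_dim)  latents from g_enc
    c:   (batch, T, c_dim)  contexts from g_ar
    W_k: (c_dim, z_dim)     linear prediction matrix for offset k (requires t + k < T)
    """
    pred = c[:, t] @ W_k                  # (batch, z_dim): prediction of z_{t+k} from c_t
    target = z[:, t + k]                  # (batch, z_dim): the true future latents
    # scores[i, j] = log f_k(x_j, c_i): prediction of sequence i scored against latent of sequence j
    scores = pred @ target.t()            # (batch, batch), positives on the diagonal
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)
```

Summing this loss over offsets k = 1..K and averaging over t gives the full training objective; g_enc, g_ar, and all W_k are trained jointly, which is the "end-to-end" point from the introduction.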
2.4. Related work
- triplet losses use a max-margin objective to separate positive from negative examples
- time-contrastive learning minimizes the distance between embeddings from multiple viewpoints of the same scene and maximizes the distance between embeddings extracted from different timesteps
- in Word2Vec, neighbouring words are predicted using a contrastive loss
3. Experiments
3.1. Audio
3.2. Vision
(figures from the paper: model architecture and experiment results)
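
In both the audio (LibriSpeech phone and speaker classification) and vision (ImageNet) experiments, the learned representations are evaluated with a linear probe: CPC is trained without labels, its weights are frozen, and only a linear classifier is trained on top. A minimal sketch, reusing the assumed `CPCEncoder`/`CPCContext` modules from the earlier block; `num_classes` and the pooling choice are task-dependent assumptions, not the paper's exact protocol:

```python
import torch
import torch.nn as nn

# hypothetical downstream probe: CPC weights are frozen, only the linear layer is trained
encoder, context = CPCEncoder(), CPCContext()
# ... load pretrained CPC weights here ...
for p in list(encoder.parameters()) + list(context.parameters()):
    p.requires_grad = False

num_classes = 10                      # placeholder; depends on the downstream task
probe = nn.Linear(256, num_classes)   # trained with standard cross-entropy on labels

def probe_logits(x):                  # x: (batch, channels, time)
    with torch.no_grad():
        c = context(encoder(x))       # (batch, T, c_dim) frozen CPC contexts
    # average-pool over time for a sequence-level label (e.g. speaker ID)
    return probe(c.mean(dim=1))
```

For per-frame targets such as phone labels, the linear classifier would be applied to each c_t instead of the pooled vector.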
3.3. Reinforcement Learning
4. Conclusions
- CPC combines autoregressive modeling and noise-contrastive estimation with intuitions from predictive coding to learn abstract representations in an unsupervised fashion
Notes
- ~~best paper of this week~~
- I will implement this paper in code as soon as possible