alibi-detect
Potential inconsistency in cd_text_imdb.ipynb
Hello,
I have a question regarding the consistency of the MMD TensorFlow detector and the MMD PyTorch detector in the cd_text_imdb.ipynb Jupyter notebook. In my opinion, the preprocessing function passed when instantiating alibi_detect.cd.MMDDrift (cf. the preprocess_fn argument) is not identical across the two frameworks (TensorFlow and PyTorch, respectively). preprocess_fn is defined as follows:
- TensorFlow:
preprocess_fn = partial(preprocess_drift, model=uae, tokenizer=tokenizer,
max_len=max_len, batch_size=32)
with
uae = UAE(input_layer=embedding, shape=shape, enc_dim=enc_dim)
such that embedding represents the embeddings derived from a transformer model in TensorFlow (cf. alibi_detect.models.tensorflow.TransformerEmbedding) and UAE is the TensorFlow implementation of an untrained autoencoder (cf. alibi_detect.cd.tensorflow.preprocess). If I am not mistaken, there is no UAE implemented in PyTorch - I'll come back to this later.
- PyTorch:
preprocess_fn = partial(preprocess_drift, model=model, tokenizer=tokenizer,
max_len=max_len, batch_size=32, device=device)
with
model = nn.Sequential(
embedding_pt,
nn.Linear(768, 256),
nn.ReLU(),
nn.Linear(256, enc_dim)
).to(device).eval()
such that embedding_pt represents the embeddings derived from a transformer model in PyTorch (cf. alibi_detect.models.pytorch.TransformerEmbedding).
I think that the main difference between the TensorFlow and PyTorch snippets comes from the respective definitions of the model argument. Since the UAE class does not exist (yet) in alibi_detect.cd.pytorch, the PyTorch part requires us to define the dimensionality reduction step ourselves. The dimension of the embeddings is reduced by:
(i) adding a dense layer on top of the BERT embeddings (input dimension: 768; output dimension: 256),
(ii) applying a ReLU activation function, and
(iii) adding another dense layer (input dimension: 256; output dimension: enc_dim).
Although one reads that the detector with the PyTorch backend is identical to the detector with the TensorFlow backend (first sentence at the beginning of the MMD PyTorch detector > Initialize subsection), the dimensionality reduction step in PyTorch is different from that in TensorFlow, i.e., the uae object (cf. implementation of the MMD detector with the TensorFlow backend) does not reduce the dimensionality of the BERT embeddings in the same way as the model object used in PyTorch. Indeed, uae is an instance of the UAE class, and the associated constructor indicates that a self.encoder is used with:
(i) a dense layer (input dimension: 768; output dimension: 522, since 522 = 32 + 2 x ( int([768 - 32]/3) ) = 32 + 2 x 245)
(ii) a ReLU activation function
(iii) a dense layer (input dimension: 522; output dimension: 277, since 277 = 32 + 1 x ( int([768 - 32]/3) ) = 32 + 1 x 245)
(iv) a ReLU activation function
(v) a dense layer (input dimension: 277; output dimension: 32)
The above MLP is defined in the constructor of the _Encoder class.
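To make the width rule above concrete, here is a small sketch (plain Python, no alibi-detect dependency) of how the _Encoder hidden-layer sizes described in steps (i)-(v) follow from input dimension 768 and enc_dim = 32:

```python
# Hidden-layer widths of the TensorFlow UAE encoder, per the rule quoted above:
# the step size is int((input_dim - enc_dim) / 3), and the layers shrink from
# enc_dim + 2*step down to enc_dim.
input_dim, enc_dim = 768, 32
step = (input_dim - enc_dim) // 3                       # int((768 - 32) / 3) = 245
widths = [enc_dim + 2 * step, enc_dim + step, enc_dim]
print(widths)  # [522, 277, 32]
```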
Based on what I described here, and assuming there is no misunderstanding on my part: is it intended that the dimensionality reduction steps are defined differently for the TensorFlow and PyTorch backends? Many thanks for your attention! (Tagging @ascillitoe since he took over the latest issue related to cd_text_imdb.ipynb (issue #437) 😉 )
Hi @francoispichard, thank you for the very nicely stated issue! I agree with you: the PyTorch and TensorFlow dimension reduction steps are functionally similar, but they are not identical. print(uae.encoder.layers[1].summary()) gives:
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (5, 768) 0
dense (Dense) (5, 522) 401418
dense_1 (Dense) (5, 277) 144871
dense_2 (Dense) (5, 32) 8896
Whereas, for PyTorch, torchinfo.summary(model[1:], input_size=(5, 768)) shows the dimension reduction layers to be:
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
Sequential [5, 32] --
├─Linear: 1-1 [5, 256] 196,864
├─ReLU: 1-2 [5, 256] --
├─Linear: 1-3 [5, 32] 8,224
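As a sanity check, the Param # columns of both summaries can be reproduced by hand: a Dense/Linear layer with a bias has in_dim * out_dim + out_dim parameters.

```python
# Reproduce the parameter counts reported by the Keras and torchinfo summaries.
def linear_params(in_dim, out_dim):
    # Weight matrix (in_dim * out_dim) plus one bias term per output unit.
    return in_dim * out_dim + out_dim

tf_uae = [linear_params(768, 522), linear_params(522, 277), linear_params(277, 32)]
pt_head = [linear_params(768, 256), linear_params(256, 32)]
print(tf_uae)   # [401418, 144871, 8896]
print(pt_head)  # [196864, 8224]
```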
At the very least we should reword the notebook to make that clearer, but as you hinted, really we should add a UAE class for PyTorch.
PR https://github.com/SeldonIO/alibi-detect/pull/656 will add a new PyTorch UAE class. Following this, we should update the examples to use it.
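In the meantime, anyone wanting parity between the two backends could build a PyTorch reduction head that mirrors the TensorFlow UAE encoder widths by hand. The sketch below is hypothetical (the actual UAE class landing in PR #656 may be structured differently); it only reuses the width rule from the _Encoder constructor discussed above:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a PyTorch reduction head mirroring the TensorFlow UAE
# encoder (768 -> 522 -> ReLU -> 277 -> ReLU -> 32), using the _Encoder width
# rule step = int((input_dim - enc_dim) / 3).
input_dim, enc_dim = 768, 32
step = (input_dim - enc_dim) // 3                       # 245
widths = [enc_dim + 2 * step, enc_dim + step, enc_dim]  # [522, 277, 32]

layers, in_dim = [], input_dim
for i, out_dim in enumerate(widths):
    layers.append(nn.Linear(in_dim, out_dim))
    if i < len(widths) - 1:
        layers.append(nn.ReLU())  # ReLU between hidden layers, as in _Encoder
    in_dim = out_dim
encoder = nn.Sequential(*layers).eval()

with torch.no_grad():
    out = encoder(torch.randn(5, input_dim))
print(out.shape)  # torch.Size([5, 32])
```

This head could then replace the nn.Sequential used for model in the PyTorch snippet above, so both backends reduce the BERT embeddings with the same architecture.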