
Potential inconsistency in cd_text_imdb.ipynb

Open · francoispichard opened this issue 2 years ago · 1 comment

Hello,

I have a question regarding the consistency of the MMD TensorFlow detector and the MMD PyTorch detector in the cd_text_imdb.ipynb Jupyter notebook. In my opinion, the preprocessing function passed when instantiating alibi_detect.cd.MMDDrift (cf. the preprocess_fn argument) is not identical across the two frameworks (TensorFlow and PyTorch, respectively). preprocess_fn is defined as follows:

  • TensorFlow:
preprocess_fn = partial(preprocess_drift, model=uae, tokenizer=tokenizer, 
                        max_len=max_len, batch_size=32)

with

uae = UAE(input_layer=embedding, shape=shape, enc_dim=enc_dim)

such that embedding represents the embeddings derived from a transformer model in TensorFlow (cf. alibi_detect.models.tensorflow.TransformerEmbedding) and UAE is the TensorFlow implementation of an untrained autoencoder (cf. alibi_detect.cd.tensorflow.preprocess). If I am not mistaken, there is no UAE implemented in PyTorch; I'll come back to this later.

  • PyTorch:
preprocess_fn = partial(preprocess_drift, model=model, tokenizer=tokenizer, 
                        max_len=max_len, batch_size=32, device=device)

with

model = nn.Sequential(
    embedding_pt,
    nn.Linear(768, 256),
    nn.ReLU(),
    nn.Linear(256, enc_dim)
).to(device).eval()

such that embedding_pt represents the embeddings derived from a transformer model in PyTorch (cf. alibi_detect.models.pytorch.TransformerEmbedding).

I think that the main difference between the TensorFlow and PyTorch snippets comes from the respective definitions of the model argument. Since the UAE class does not exist (yet) in alibi_detect.cd.pytorch, the PyTorch part requires us to define a dimensionality reduction step ourselves. The dimension of the embeddings is reduced by: (i) adding a dense layer on top of the BERT embeddings (input dimension: 768; output dimension: 256), (ii) applying a ReLU activation function, and (iii) adding another dense layer (input dimension: 256; output dimension: enc_dim).

Although one reads that the detector with the PyTorch backend is identical to the detector with the TensorFlow backend (first sentence at the beginning of the MMD PyTorch detector > Initialize subsection), the dimensionality reduction step in PyTorch is different from that in TensorFlow, i.e., the uae object (cf. implementation of the MMD detector with the TensorFlow backend) does not reduce the dimensionality of the BERT embeddings in the same way as the model object used in PyTorch. Indeed, uae is an instance of the UAE class, and the associated constructor indicates that self.encoder consists of:

  • a dense layer (input dimension: 768; output dimension: 522, since 522 = 32 + 2 x int((768 - 32)/3) = 32 + 2 x 245),
  • a ReLU activation function,
  • a dense layer (input dimension: 522; output dimension: 277, since 277 = 32 + 1 x int((768 - 32)/3) = 32 + 1 x 245),
  • a ReLU activation function,
  • a dense layer (input dimension: 277; output dimension: 32).

This MLP is defined in the constructor of the _Encoder class.
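As a side note, the hidden-layer sizes above can be reproduced with a few lines of plain Python. This is only a sketch of the sizing rule as I understand it from the _Encoder constructor; encoder_dims is an illustrative helper, not part of alibi-detect:

```python
def encoder_dims(input_dim: int, enc_dim: int) -> list:
    """Sketch of the _Encoder hidden-layer sizing: each dense layer
    shrinks toward enc_dim in steps of int((input_dim - enc_dim) / 3)."""
    step = (input_dim - enc_dim) // 3  # int((768 - 32) / 3) = 245 here
    return [enc_dim + 2 * step, enc_dim + step, enc_dim]

print(encoder_dims(768, 32))  # [522, 277, 32]
```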

Based on what I described here, and assuming there is no misunderstanding on my part, is it intended that the dimensionality reduction steps are not defined identically for the TensorFlow and PyTorch backends? Many thanks for your attention! (Tagging @ascillitoe since he took over the latest issue related to cd_text_imdb.ipynb (issue #437) 😉 )

francoispichard avatar Jun 03 '22 08:06 francoispichard

Hi @francoispichard, thank you for the very nicely stated issue! I agree with you: the PyTorch and TensorFlow dimension reduction steps are functionally similar, but they are not identical. print(uae.encoder.layers[1].summary()) gives:

 Layer (type)                Output Shape              Param #
=================================================================
 flatten (Flatten)           (5, 768)                  0
 dense (Dense)               (5, 522)                  401418
 dense_1 (Dense)             (5, 277)                  144871
 dense_2 (Dense)             (5, 32)                   8896

Whereas, for PyTorch, torchinfo.summary(model[1:], input_size=(5, 768)) shows the dimension reduction layers to be:

Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Sequential                               [5, 32]                   --
├─Linear: 1-1                            [5, 256]                  196,864
├─ReLU: 1-2                              [5, 256]                  --
├─Linear: 1-3                            [5, 32]                   8,224
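As a quick cross-check, the Param # columns of both summaries are consistent with the stated layer sizes, using the usual dense-layer parameter count in_dim * out_dim + out_dim (weights plus bias). dense_params is just an illustrative helper:

```python
def dense_params(in_dim: int, out_dim: int) -> int:
    """Parameter count of a fully connected layer: weights + biases."""
    return in_dim * out_dim + out_dim

# TensorFlow UAE encoder: 768 -> 522 -> 277 -> 32
print([dense_params(768, 522), dense_params(522, 277), dense_params(277, 32)])
# [401418, 144871, 8896]

# PyTorch head: 768 -> 256 -> 32
print([dense_params(768, 256), dense_params(256, 32)])
# [196864, 8224]
```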

At the very least we should reword the notebook to make that clearer, but as you hinted, really we should add a UAE class for PyTorch.
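In the meantime, a rough sketch of a PyTorch encoder mirroring the TensorFlow UAE layer sizes (768 -> 522 -> 277 -> 32) could look like the following. This is my own suggestion built from standard torch.nn modules, not an alibi-detect API:

```python
import torch
import torch.nn as nn

enc_dim = 32
step = (768 - enc_dim) // 3  # 245, matching the TF _Encoder sizing rule

# Dimension reduction head with the same layer sizes as the TF UAE encoder.
encoder = nn.Sequential(
    nn.Linear(768, enc_dim + 2 * step),             # 768 -> 522
    nn.ReLU(),
    nn.Linear(enc_dim + 2 * step, enc_dim + step),  # 522 -> 277
    nn.ReLU(),
    nn.Linear(enc_dim + step, enc_dim),             # 277 -> 32
).eval()

with torch.no_grad():
    out = encoder(torch.randn(5, 768))
print(out.shape)  # torch.Size([5, 32])
```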

ascillitoe avatar Jun 07 '22 09:06 ascillitoe

> At the very least we should reword the notebook to make that clearer, but as you hinted, really we should add a UAE class for PyTorch.

PR https://github.com/SeldonIO/alibi-detect/pull/656 will add a new PyTorch UAE class. Following this, we should update the examples to use it.

ascillitoe avatar Oct 19 '22 15:10 ascillitoe