Krishna Pillutla

Results 15 comments of Krishna Pillutla

Hi @samhedin, you are right.

> The activations in the final hidden layer is taken: `outs.hidden_states[-1]`, right?

Correct.

> Looking at `hidden_state[0]` is looking at the embedding of the first...

I would ignore this warning (note that we do not care about estimating the centroids).

We simply take the encoding of the last token (this is true for both RoBERTa and GPT-2).
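To make "the encoding of the last token" concrete, here is a minimal NumPy sketch. It assumes you already have the final-layer hidden states as an array of shape `(batch, seq_len, dim)` and a right-padded attention mask. The function name `last_token_features` and the padding handling are illustrative assumptions, not code from the MAUVE repository.

```python
import numpy as np

def last_token_features(hidden_states, attention_mask):
    """Return the final-layer hidden state of the last real token per sequence.

    hidden_states: array of shape (batch, seq_len, dim).
    attention_mask: array of shape (batch, seq_len), 1 for real tokens and
        0 for padding. Right padding is assumed (an assumption of this sketch).
    """
    hidden_states = np.asarray(hidden_states)
    attention_mask = np.asarray(attention_mask)
    # Index of the last non-padding position in each sequence.
    last_idx = attention_mask.sum(axis=1) - 1
    batch_idx = np.arange(hidden_states.shape[0])
    # Fancy indexing picks one (dim,) vector per sequence.
    return hidden_states[batch_idx, last_idx]

# Toy input: 2 sequences, 3 positions, 2 hidden dimensions.
hidden = np.arange(2 * 3 * 2, dtype=float).reshape(2, 3, 2)
mask = np.array([[1, 1, 1], [1, 1, 0]])  # second sequence has one pad token
features = last_token_features(hidden, mask)
```

When each sequence is encoded on its own without padding, this reduces to taking position `-1` along the sequence axis.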

Hi @samhedin, We did most of our experimentation with GPT-2, so we chose the last token. We stuck to the same for RoBERTa to avoid unnecessary confounding factors. Through all...

Good point. In principle, MAUVE simply measures the gap between two distributions via embeddings of their samples. You could use MAUVE for other modalities and more generally, any two distributions...
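To illustrate why nothing here is specific to text, the sketch below takes two histograms over shared clusters (for example, obtained by quantizing embeddings from any modality) and computes the area under a divergence curve in the spirit of MAUVE. It is a simplified illustration, not the library's implementation: the function names, the number of mixture points, and the scaling constant `scale=5.0` are assumptions of this sketch.

```python
import numpy as np

def kl(a, b):
    """KL(a || b) for discrete distributions; terms with a == 0 contribute 0."""
    support = a > 0
    value = np.sum(a[support] * np.log(a[support] / b[support]))
    return max(float(value), 0.0)  # guard against tiny negative rounding error

def frontier_area(p, q, num_points=25, scale=5.0):
    """Area under the curve (exp(-scale*KL(q||r)), exp(-scale*KL(p||r))),
    where r = lam*p + (1-lam)*q sweeps over mixtures of p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    xs, ys = [0.0, 1.0], [1.0, 0.0]  # limiting endpoints of the curve
    for lam in np.linspace(1e-6, 1 - 1e-6, num_points):
        r = lam * p + (1 - lam) * q
        xs.append(np.exp(-scale * kl(q, r)))
        ys.append(np.exp(-scale * kl(p, r)))
    xs, ys = np.array(xs), np.array(ys)
    # Sort by x ascending; break ties by y descending (the curve is decreasing).
    order = np.lexsort((-ys, xs))
    x, y = xs[order], ys[order]
    # Trapezoidal rule written out by hand.
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))

same = frontier_area([0.5, 0.5], [0.5, 0.5])      # identical distributions
disjoint = frontier_area([1.0, 0.0], [0.0, 1.0])  # no overlap at all
```

Identical histograms give an area close to 1 and non-overlapping ones give an area close to 0, regardless of what kind of data produced the embeddings.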

Hi @nostalgebraist, The settings in your notebook look reasonable to me. In general, MAUVE is meant to compare two or more settings. In contrast, the absolute value of MAUVE is...

Hi @nostalgebraist, Thank you for these detailed notes (and sorry for the slow response). Here is my intuition: in order to quantify subtle errors, the number of errors must be...

The current update is mathematically incorrect. The bug is hard to detect because it does not affect the loss or predictions by a noticeable amount. However, the weights show a...

@zhangjf-nlp Thank you for the implementation (and sorry I missed your earlier comment)! A GPU implementation of k-means would be most welcome from you or @mpreziuso. However, could you implement...

Hi @zhangjf-nlp, thank you for the contribution! Let us continue the discussion on the pull request [here](https://github.com/krishnap25/mauve/pull/16#issuecomment-2094638443).