
How important is it to use sentence_output_nheads?

hjian42 opened this issue 2 years ago

From the code at https://github.com/stangelid/qt/blob/c136ac00e03adf443b90cd65ba0523a3617be01f/src/encoders.py#L37, the encoder creates the multi-head sentence representation H by applying a linear + norm layer that transforms the CLS token output from (batch_size, nsents, 1, model_d) into (batch_size, nsents, sentence_output_nheads, new_model_d). I wonder why this extra layer is needed, instead of feeding the original (batch_size, nsents, 1, model_d) tensor directly into the quantization layer.
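
For concreteness, here is a minimal PyTorch sketch of the head-splitting step I'm asking about. The class and parameter names (`MultiHeadSentenceOutput`, `nheads`, `proj`) are my own paraphrase, not the repository's exact code in `encoders.py`; the actual projection/norm details there may differ:

```python
import torch
import torch.nn as nn

class MultiHeadSentenceOutput(nn.Module):
    """Sketch (hypothetical names): split one CLS vector into H head
    vectors via a learned linear layer, then normalize each head,
    rather than quantizing the single CLS vector directly."""

    def __init__(self, model_d: int, nheads: int):
        super().__init__()
        assert model_d % nheads == 0
        self.nheads = nheads
        self.head_d = model_d // nheads
        self.proj = nn.Linear(model_d, model_d)  # learned mixing before the split
        self.norm = nn.LayerNorm(self.head_d)    # normalize each head vector

    def forward(self, cls_out: torch.Tensor) -> torch.Tensor:
        # cls_out: (batch_size, nsents, model_d), the CLS token per sentence
        b, n, _ = cls_out.shape
        heads = self.proj(cls_out).view(b, n, self.nheads, self.head_d)
        return self.norm(heads)  # (batch_size, nsents, nheads, head_d)

# Usage: 8 heads over a 512-d CLS vector -> 8 quantizable 64-d vectors
mh = MultiHeadSentenceOutput(model_d=512, nheads=8)
out = mh(torch.randn(2, 10, 512))
print(out.shape)  # torch.Size([2, 10, 8, 64])
```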

I suspect you experimented with alternatives and chose this design deliberately, and I would like to hear more about the reasoning. Thanks in advance for your response.

hjian42 · Jul 14 '22