eyuansu62 issues

Results: 24 issues of eyuansu62

Gromov-Wasserstein Distance between 1-D vectors

question

Different results between eval mode and test mode

Why do I get different results between eval mode and test mode? ![image](https://user-images.githubusercontent.com/30862458/168093289-78c45139-a53c-43f2-939b-56e67a66c400.png)

help wanted

Where is the pre-training code?

Where is the code for the 4 pre-training tasks? I cannot find it.

Can the code be modified for distributed training?

The relationship with information bottleneck

The information bottleneck formula is ![image](https://user-images.githubusercontent.com/30862458/171373111-e852350a-ef63-455e-b181-4d805cda47e8.png) But I cannot find it in this paper. The loss function of MIB seems to be derived from a definition proposed by the authors....
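
For reference, the information bottleneck objective is usually written as below. This is the standard textbook form, not taken from the linked image or from the paper under discussion; that the image shows this same expression is an assumption. Here $X$ is the input, $Y$ the target, $Z$ the learned representation, and $\beta$ a trade-off weight.

```latex
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```

A loss that does not reduce to this trade-off between compressing $X$ and preserving information about $Y$ would be a different objective, which appears to be what the question is asking about.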

Keep it up!!

What if I just want to see how you use Bi-TreeLSTM to encode the question?

Your work is really interesting!! But the datasets are all really huge, and if I just want to learn how the Bi-TreeLSTM model encodes the context part, what can I...

Illustration of FGW on trees

In Section 4.1, you mention an example involving trees. How can I compute FGW on trees? Could you release the relevant code for Section 4.1? It will be...

What's the difference between SimCTG and CnNT?

Judging from the core formula, SimCTG just replaces the positive sample with the current token itself and the negative sample with the previous token.
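
To make the comparison concrete, here is a minimal NumPy sketch of a token-level max-margin contrastive loss of the kind the question describes. It assumes that each token is its own positive and that every other token in the same sequence acts as a negative, which is broader than "the previous token" in the question. The function name `simctg_contrastive_loss` and the default `margin` are hypothetical and not taken from any repository.

```python
import numpy as np

def simctg_contrastive_loss(h, margin=0.5):
    """Token-level max-margin contrastive loss (illustrative sketch).

    h: array of shape (seq_len, dim), one representation per token.
    """
    # Cosine similarity between every pair of token representations.
    h_norm = h / np.linalg.norm(h, axis=-1, keepdims=True)
    sim = h_norm @ h_norm.T
    # Positive pair: each token with itself (cosine similarity of 1).
    pos = np.diag(sim)[:, None]
    # Hinge term for every (token, other token) pair.
    loss = np.maximum(0.0, margin - pos + sim)
    n = h.shape[0]
    off_diag = ~np.eye(n, dtype=bool)  # exclude the i == j pairs
    return loss[off_diag].sum() / (n * (n - 1))
```

Under these assumptions the loss is zero once every pair of distinct tokens has cosine similarity at most `1 - margin`, so it only pushes token representations apart and never pulls separate samples together.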

About RoPE embedding

Why were rotary position encodings (RoPE) applied to only 64 dimensions of each head rather than the full dimensions?
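
As an illustration of what "RoPE on 64 dimensions of each head" means mechanically, here is a minimal NumPy sketch of a partial rotary embedding. It assumes interleaved dimension pairs and a base of 10000. The function names and the head size used in the usage note are hypothetical, and this is not the code of the model the question refers to.

```python
import numpy as np

def rope_angles(seq_len, rotary_dim, base=10000.0):
    # One rotation frequency per pair of rotated dimensions.
    inv_freq = 1.0 / (base ** (np.arange(0, rotary_dim, 2) / rotary_dim))
    positions = np.arange(seq_len)
    return np.outer(positions, inv_freq)  # (seq_len, rotary_dim // 2)

def apply_partial_rope(x, rotary_dim, base=10000.0):
    """Rotate only the first `rotary_dim` features of each head.

    x: float array of shape (seq_len, n_heads, head_dim).
    """
    seq_len, _, head_dim = x.shape
    assert rotary_dim % 2 == 0 and rotary_dim <= head_dim
    theta = rope_angles(seq_len, rotary_dim, base)
    cos = np.cos(theta)[:, None, :]  # broadcast over heads
    sin = np.sin(theta)[:, None, :]
    x_rot, x_pass = x[..., :rotary_dim], x[..., rotary_dim:]
    x1, x2 = x_rot[..., 0::2], x_rot[..., 1::2]  # interleaved pairs
    out = np.empty_like(x_rot)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    # The remaining features carry no positional rotation.
    return np.concatenate([out, x_pass], axis=-1)
```

With a hypothetical `head_dim` of 256, calling `apply_partial_rope(x, rotary_dim=64)` rotates the first 64 features of every head and returns the other 192 unchanged, so only part of each head's query-key dot product depends on relative position.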