Yuxin Jiang 姜宇心

Results 13 comments of Yuxin Jiang 姜宇心

Hi, I am a little confused about your question. During training, we only use the dev set for evaluation in order to save the best checkpoint. Do you mean that...

**Why 5-fold cross-validation is used:** After the model is well trained, it can produce sentence embedding vectors, which can be used directly to compute the cosine similarity for STS...
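The STS scoring step can be sketched as follows. The embeddings here are toy vectors invented for illustration; in practice they would come from the trained model:

```python
import numpy as np

# Toy sentence embeddings (in practice, produced by the trained encoder).
emb_a = np.array([0.2, 0.8, 0.1])
emb_b = np.array([0.25, 0.7, 0.05])

# Cosine similarity between the two embeddings, used as the STS score.
cos_sim = emb_a @ emb_b / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))
```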

First, you need to modify "model.py": change **num_labels** to 3. If the number of classes is more than 2, then AUC cannot be computed by the original...
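For reference, one common way to get an AUC when **num_labels** > 2 is a macro-averaged one-vs-rest AUC. This is only an illustrative sketch (not the repo's actual evaluation code), using the rank-statistic form of binary AUC:

```python
import numpy as np

def binary_auc(scores, labels):
    # AUC = probability that a random positive example outranks a random negative.
    order = np.argsort(scores)
    ranks = np.empty_like(order, dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def macro_ovr_auc(probs, labels, num_labels=3):
    # One-vs-rest AUC per class, averaged over classes (macro average).
    return np.mean([binary_auc(probs[:, c], (labels == c).astype(int))
                    for c in range(num_labels)])
```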

Hi, sorry for the late reply. I have released the unsupervised BERT-base checkpoint; you can download it from the link https://drive.google.com/drive/folders/1OcgJ-7gU_N7J7x5ezrigFLlTU8h7Uvjx.

Hi, you could try changing 'Roberta' to 'DebertaV2' in models.py, e.g., change `from transformers.models.roberta.modeling_roberta import RobertaPreTrainedModel, RobertaModel, RobertaLMHead` to `from transformers.models.deberta_v2.modeling_deberta_v2 import DebertaV2PreTrainedModel, DebertaV2Model, DebertaV2LMHead`, and create...

Thanks for your interest in our work. In each iteration for the student model, we start with the model trained in the last iteration.

Hi, thanks for your interest in our work. The delta weights are 25 GB because we use **float32** as torch_dtype, while vicuna-7b-delta-v1.1 uses **float16**. We have changed to float16 and...
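The size difference follows directly from the bytes per parameter. A back-of-the-envelope estimate for a 7B-parameter checkpoint (the parameter count is assumed here for illustration):

```python
# Rough checkpoint size estimate for a 7B-parameter model (assumed count).
num_params = 7e9
size_fp32_gb = num_params * 4 / 1e9  # float32: 4 bytes per parameter -> ~28 GB
size_fp16_gb = num_params * 2 / 1e9  # float16: 2 bytes per parameter -> ~14 GB
```

Switching torch_dtype to float16 roughly halves the on-disk size, which is why the released weights shrink so much.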

Hi, in most cases the GPT response contains strings like `Instruction: xxx\nInput: xxx`. So we use `re.search(r"Instruction: (.+)\n", raw_instructions)` to extract the generated instruction. However, in some cases, the response...
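A minimal sketch of that extraction step; the response string below is made up for illustration:

```python
import re

# Hypothetical GPT response following the "Instruction: ...\nInput: ..." template.
raw_instructions = "Instruction: Summarize the text.\nInput: The quick brown fox.\n"

# Grab everything between "Instruction: " and the next newline.
match = re.search(r"Instruction: (.+)\n", raw_instructions)
instruction = match.group(1) if match else None
```

Because `.` does not match newlines by default, the greedy `(.+)` still stops at the first `\n`, so the `Input:` part is not captured.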

Hi, thank you for your interest in our work :) Unfortunately, we are unable to release the dataset for distillation at this moment. Our work is still in progress, and...

Thanks for your interest in our work. Let me illustrate it using an example: In the first iteration, the _train_pool_ and _cache_pool_ are both 52,000 Alpaca instructions and we generate...