
An experiment about meta-test

Open wjczf123 opened this issue 2 years ago • 4 comments

Hi. I deleted lines 384-385 and line 447 of learner.py to avoid fine-tuning on the support set during the meta-test. Is this right? Thanks.

wjczf123 avatar Sep 20 '22 15:09 wjczf123

Hi @wjczf123, yeah. If you remove #384-385 and #447 of learner.py, the code will skip fine-tuning on the meta-test support set.
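
For anyone reading along, here is a minimal sketch of what that gating looks like. This is not the repo's actual code; `finetune_on_support`, `support_batch`, `query_batch`, and `inner_steps` are assumed names, and the loss access assumes an HF-style model output:

```python
import torch

def meta_test_episode(model, optimizer, support_batch, query_batch,
                      finetune_on_support=True, inner_steps=1):
    if finetune_on_support:
        # Roughly what the #384-385 / #447 code paths do: a few gradient
        # steps on the support set before evaluating on the query set.
        model.train()
        for _ in range(inner_steps):
            loss = model(**support_batch).loss  # HF-style output assumed
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # Evaluate on the query set without tracking gradients.
    model.eval()
    with torch.no_grad():
        return model(**query_batch)
```

Setting `finetune_on_support=False` then gives the ablation discussed in this issue without deleting any lines.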

iofu728 avatar Sep 21 '22 08:09 iofu728

Thanks for your reply. I ran it once under the inter 5-way 1-shot setting and the results looked very bad.

```
2022-09-20 22:57:35 INFO: - span_f1 = 0.7218073781712385
2022-09-20 22:57:35 INFO: - span_p = 0.7370060346505719
2022-09-20 22:57:35 INFO: - span_r = 0.7072229140722269
2022-09-20 22:57:35 INFO: - type_f1 = 0.156973848019738
2022-09-20 22:57:35 INFO: - type_p = 0.156973848069738
2022-09-20 22:57:35 INFO: - type_r = 0.156973848069738
2022-09-20 22:57:35 INFO: - 9.445,9.063,9.250,73.701,70.722,72.181,15.697,15.697,15.697,0.000,0.000,0.000
```

wjczf123 avatar Sep 21 '22 13:09 wjczf123

I understand the performance will drop, but it performs surprisingly poorly.

wjczf123 avatar Sep 21 '22 13:09 wjczf123

Sorry, I made a mistake earlier. You can't directly remove #447, because the type-classification stage contains logic there that generates the type embeddings. The solution is to keep #447 and change #165 to self.model.eval(). You may also need to remove #191-192.
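
In code terms, the idea is roughly the sketch below (hypothetical names, not the repo's actual API): keep the forward pass that builds the type embeddings, but run it in eval mode with gradients disabled so no fine-tuning happens:

```python
import torch

def build_type_embeddings_no_finetune(model, support_batch):
    # Corresponds to changing #165 to self.model.eval():
    # disables dropout/batch-norm updates.
    model.eval()
    # Keep the #447-style forward pass (it is needed to generate the
    # type embeddings), but block gradient updates so the support set
    # is never actually trained on.
    with torch.no_grad():
        # `encode_types` is an assumed method name for illustration only.
        type_embeddings = model.encode_types(**support_batch)
    return type_embeddings
```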

iofu728 avatar Sep 22 '22 02:09 iofu728

Thanks. The new result seems to be correct.

```
2022-09-24 20:43:14 INFO: - ***** Eval results inter-test *****
2022-09-24 20:43:14 INFO: - f1 = 0.6104350036041772
2022-09-24 20:43:14 INFO: - f1_threshold = 0.6133144703132174
2022-09-24 20:43:14 INFO: - loss = tensor(4.1757, device='cuda:0')
2022-09-24 20:43:14 INFO: - precision = 0.6232885601193933
2022-09-24 20:43:14 INFO: - precision_threshold = 0.6340790479672884
2022-09-24 20:43:14 INFO: - recall = 0.5981008717310069
2022-09-24 20:43:14 INFO: - recall_threshold = 0.5938667496886657
2022-09-24 20:43:14 INFO: - span_f1 = 0.7218073781712385
2022-09-24 20:43:14 INFO: - span_p = 0.7370060346505719
2022-09-24 20:43:14 INFO: - span_r = 0.7072229140722269
2022-09-24 20:43:14 INFO: - type_f1 = 0.8474159401741568
2022-09-24 20:43:14 INFO: - type_p = 0.8474159402241568
2022-09-24 20:43:14 INFO: - type_r = 0.8474159402241568
2022-09-24 20:43:14 INFO: - 62.329,59.810,61.044,73.701,70.722,72.181,84.742,84.742,84.742,63.408,59.387,61.331
```

wjczf123 avatar Sep 24 '22 12:09 wjczf123

Sorry to bother you again. Why is the 5-shot performance worse than 1-shot after ablating fine-tuning in the meta-test? For example, the F1 under inter 5-way 5-shot is about 54, while under 1-shot it is 61.04. Have you observed this phenomenon before? It doesn't seem normal. Thanks.

wjczf123 avatar Sep 30 '22 04:09 wjczf123

Hi @wjczf123, this may be reasonable, although we have not run the corresponding ablation experiments on 5-shot. First, the 5-shot and 1-shot datasets cannot be compared directly; each is just a sampled subset of Few-NERD. That said, according to our experimental results on inter 5-1 and inter 5-5, the 5-shot results do seem better. Second, we found in our experiments that inter 5-5 and inter 10-5 need more fine-tuning steps in the meta-test, so removing the fine-tuning may have a greater impact on 5-shot. Hope this helps.
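
To illustrate the last point, the fine-tuning budget is often made setting-dependent; the sketch below is purely illustrative, and the step counts are assumed values, not the paper's actual hyperparameters:

```python
# Illustrative only: larger support sets (5-shot) typically benefit from
# more inner fine-tuning steps at meta-test time than 1-shot settings.
INNER_STEPS = {
    "inter-5-1": 20,
    "inter-5-5": 50,
    "inter-10-1": 20,
    "inter-10-5": 50,
}

def get_inner_steps(setting: str, default: int = 20) -> int:
    return INNER_STEPS.get(setting, default)
```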

iofu728 avatar Sep 30 '22 04:09 iofu728

Thanks. Hope you have a good day.

wjczf123 avatar Sep 30 '22 04:09 wjczf123