keyphrase-generation-rl

ndcg_array = dcg_array / dcg_max_array

Open SaidaSaad opened this issue 5 years ago • 16 comments

Hello, I would like to know: after we run the command for computing the evaluation scores on the prediction files, is there any explanation for this? I got these errors:

```
RuntimeWarning: invalid value encountered in true_divide
  ndcg_array = dcg_array / dcg_max_array
RuntimeWarning: invalid value encountered in true_divide
  alpha_ndcg_array = alpha_dcg_array / alpha_dcg_max_array
henlo henlo henlo
```

Thanks

SaidaSaad avatar Dec 12 '19 15:12 SaidaSaad

It sounds like some of the values in dcg_max_array are zero or NaN. Could you please wrap the `ndcg_array = dcg_array / dcg_max_array` line in a try/except block and print out the values of dcg_max_array? If some of the values are zero, you could do the division element-wise and set the result to zero whenever `dcg_max_array[i] == 0`, as in the sketch below.
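For example, a minimal sketch of that element-wise guard (with toy arrays standing in for the script's real `dcg_array` and `dcg_max_array`):

```python
import numpy as np

# Toy values standing in for the script's dcg_array / dcg_max_array.
dcg_array = np.array([1.3, 0.0, 2.1])
dcg_max_array = np.array([2.6, 0.0, 2.1])

ndcg_array = np.zeros_like(dcg_array)
for i in range(len(dcg_array)):
    if dcg_max_array[i] == 0 or np.isnan(dcg_max_array[i]):
        ndcg_array[i] = 0.0  # no valid ideal DCG, so define NDCG as 0
    else:
        ndcg_array[i] = dcg_array[i] / dcg_max_array[i]

print(ndcg_array)  # [0.5 0.  1. ]
```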

kenchan0226 avatar Dec 14 '19 11:12 kenchan0226

Yes, I printed dcg_max_array and alpha_dcg_max_array, and some of the values are zeros. So I edited the line alpha_ndcg_array = alpha_dcg_array / alpha_dcg_max_array to be:

```python
alpha_ndcg_array = np.zeros(shape=(2,))
for i in range(2):
    if alpha_dcg_max_array[i] == 0 or np.isnan(alpha_dcg_max_array[i]):
        alpha_ndcg_array[i] = 0
    else:
        alpha_ndcg_array[i] = alpha_dcg_array[i] / alpha_dcg_max_array[i]
alpha_ndcg_array = np.nan_to_num(alpha_ndcg_array)
```

I also did the same for ndcg_max_array. I got no error, but it just prints

```
henlo henlo henlo
```

and nothing else. Did I do something wrong?

Thanks

SaidaSaad avatar Dec 18 '19 12:12 SaidaSaad

This is what I got (screenshot attached).

SaidaSaad avatar Dec 18 '19 12:12 SaidaSaad

Hi, I double-checked the code. The script works even if some of the values are zeros; it just prints a warning, not an error. The np.nan_to_num(ndcg_array) call in the original code already handles the division-by-zero issue. I think you can simply use the original evaluate_prediction.py from this GitHub repository and try it. It should not print "henlo", since I cannot find any "henlo" in the entire project.
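To illustrate why the warning is harmless here (a standalone toy example, not the repository's code): a 0/0 division yields NaN and emits the RuntimeWarning, and np.nan_to_num then replaces the NaN with 0.

```python
import numpy as np

dcg_array = np.array([0.0, 1.0])
dcg_max_array = np.array([0.0, 2.0])

ndcg_array = dcg_array / dcg_max_array  # 0/0 emits the RuntimeWarning and yields nan
print(np.nan_to_num(ndcg_array))        # nan replaced by 0.0 -> [0.  0.5]
```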

kenchan0226 avatar Dec 18 '19 13:12 kenchan0226

Yes, the original code gives me the first error I mentioned in the first question, as you can see here (screenshot attached).

SaidaSaad avatar Dec 18 '19 13:12 SaidaSaad

I fixed that with the change I described in my last comment. Please let me know if I did anything wrong.

I would also like to know where I can find the evaluation scores.

Thanks

SaidaSaad avatar Dec 18 '19 13:12 SaidaSaad

Yes, I know, but my point is that the division by zero is just a warning, not an error; it will not cause the script to terminate, and the NaN values will be corrected by np.nan_to_num(ndcg_array). So you don't need to fix the division-by-zero issue yourself. Can you use the original code and use the -exp_path option to specify a path? When you run the script, it will create a results_log_*.txt file in the directory specified by -exp_path. I am sorry that I did not make this clear in the readme. If you simply use the default arguments from our readme, it will create a results_log_*.txt file in an exp/kp20k.[timestamp] folder. The value of the -exp_path argument printed on your screen will also show you the path that stores the results file.
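For example, something along these lines (the prediction folder and exp path are placeholders for your own files; the script name and flags are the ones discussed in this thread):

```
python3 evaluate_prediction.py \
    -pred_file_path pred/[your_prediction_folder]/predictions.txt \
    -src_file_path data/cross_domain_sorted/word_inspec_testing_context.txt \
    -trg_file_path data/cross_domain_sorted/word_inspec_testing_allkeywords.txt \
    -exp kp20k -exp_path exp/my_eval
```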

kenchan0226 avatar Dec 18 '19 14:12 kenchan0226

Thank you very much. Yes, you are right, those were just warnings. I have now got the results file. One more last question :)

I am computing the scores for the word_inspec test set predictions.

I used this command to compute the evaluation:

```
-pred_file_path pred/predict.kp20k.one2many.cat.copy.bi-directional.20191212-151234/predictions.txt \
-trg_file_path data/cross_domain_sorted/word_inspec_testing_allkeywords.txt \
-src_file_path data/cross_domain_sorted/word_inspec_testing_context.txt \
-exp kp20k -export_filtered_pred -disable_extra_one_word_filter -invalidate_unk \
-all_ks 5 M -present_ks 5 M -absent_ks 5 M
```

This produced the results file results_log_5_M_5_M_5_M.txt.

Can you please explain it? I see it is divided into all, present, absent, and MAE stats. Can you explain, for example, the present section: where exactly are the final F1@5, F1@M, and alpha-nDCG@5 for the present keyphrases? And what do Micro and Macro mean?

Thanks I appreciated your help

SaidaSaad avatar Dec 18 '19 14:12 SaidaSaad

Hi, the results under ====all==== are the F1 scores for predictions including both present and absent keyphrases. The results under ====present==== are the F1 scores for the present predicted keyphrases only. The results under ====absent==== are the F1 scores for the absent predicted keyphrases only. The results under ====MAE==== measure the model's ability to predict the correct number of keyphrases.

kenchan0226 avatar Dec 19 '19 03:12 kenchan0226

Yes, I understood that, but what I did not understand is that, for the present keyphrases for example, I found two F1@5 values: F1@5=0.16388 (Micro) and F1@5=0.18448 (Macro). What is the difference, and which one did you use for evaluation?

```
==================================present====================================
#predictions after filtering: 2375   #predictions after filtering per src: 4.750
#unique targets: 3602                #unique targets per src: 7.204
Begin===============classification metrics present@5===============Begin
#target: 2500, #predictions: 3602, #corrects: 500
Micro: P@5=0.2      R@5=0.13881  F1@5=0.16388
Macro: P@5=0.2      R@5=0.1712   F1@5=0.18448
Begin===============classification metrics present@M===============Begin
#target: 2375, #predictions: 3602, #corrects: 511
Micro: P@M=0.21516  R@M=0.14187  F1@M=0.17099
Macro: P@M=0.27203  R@M=0.17369  F1@M=0.21201
Begin==================Ranking metrics present@5==================Begin
MAP@5=0.13014  NDCG@5=0.49478  AlphaNDCG@5=0.7968
Begin==================Ranking metrics present@M==================Begin
MAP@M=0.13103  NDCG@M=0.49963  AlphaNDCG@M=0.82678
```

SaidaSaad avatar Dec 19 '19 11:12 SaidaSaad

Hi, we report the macro F1 scores in our paper, since those are the ones used in the previous keyphrase generation literature. Macro F1 and micro F1 are two different ways to aggregate the F1 scores of the individual test samples into one score. You can check the following two URLs for the details; the sketch below the links also illustrates the difference.

http://rushdishams.blogspot.com/2011/08/micro-and-macro-average-of-precision.html
https://datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance-in-a-multiclass-classification-settin
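A toy sketch of the two aggregation schemes (the per-document counts here are made up, and the exact per-document convention may differ slightly from the repository's implementation):

```python
import numpy as np

# Hypothetical per-document counts: (correct, predicted, target keyphrases).
docs = [(2, 5, 7), (0, 5, 3), (4, 5, 6)]

def f1(p, r):
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

# Micro: pool the raw counts over all documents, then compute one P/R/F1.
correct = sum(c for c, _, _ in docs)
predicted = sum(p for _, p, _ in docs)
target = sum(t for _, _, t in docs)
micro_f1 = f1(correct / predicted, correct / target)

# Macro: compute a score per document, then average the scores.
macro_f1 = np.mean([f1(c / p, c / t) for c, p, t in docs])

print(micro_f1, macro_f1)  # the two can differ noticeably
```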

kenchan0226 avatar Dec 19 '19 11:12 kenchan0226

Hello Kenchan

I have another question. I would like to ask: when we run the prediction using this command:

catSeq on the inspec dataset:

```
python3 interactive_predict.py -vocab data/kp20k_sorted/ \
    -src_file data/cross_domain_sorted/word_inspec_testing_context.txt \
    -pred_path pred/%s.%s -copy_attention -one2many -one2many_mode 1 \
    -model [path_to_model] -max_length 60 -remove_title_eos -n_best 1 \
    -max_eos_per_output_seq 1 -beam_size 1 -batch_size 20 -replace_unk
```

why do we have to pass the dataset itself (-vocab data/kp20k_sorted/) in that command? I think it should be enough to pass only the model and the test data set, so could you please explain why it is necessary?

SaidaSaad avatar May 06 '20 05:05 SaidaSaad

Our source code saves the word2idx and idx2word dictionaries separately in vocab.pt; they are not inside our saved model, so we still need to load them.
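(A quick way to inspect what is stored there; this is just an illustrative snippet, and the exact layout of vocab.pt depends on the preprocessing script:)

```python
import torch

# Assumption: vocab.pt unpickles to the word2idx/idx2word mappings
# described above; the exact structure may differ.
vocab = torch.load('data/kp20k_sorted/vocab.pt')
print(type(vocab))
```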

kenchan0226 avatar May 07 '20 05:05 kenchan0226

Thanks for your reply. I would like to know whether your source code saves word2idx and idx2word during the training of the model. Another question: if I have a model that was trained on only part of the dataset, can I still use the same command to get the predictions? Thank you :)

SaidaSaad avatar May 07 '20 08:05 SaidaSaad

word2idx and idx2word are saved by the preprocessing scripts. I think you can still use the same command to get the predictions.

kenchan0226 avatar May 11 '20 01:05 kenchan0226

predict.txt (attached): can you run it?

Struggle-lsl avatar Nov 12 '22 11:11 Struggle-lsl