s2e-coref
Hello, may I ask why the dev set can't be used for evaluation with the model you provided? A KeyError is raised. If I ignore the KeyError, an AttributeError is raised as well.
Just to elaborate on what I perceive to be @zyt888's issue. When I run the following with export SPLIT_FOR_EVAL=test:
python run_coref.py \
--output_dir=$OUTPUT_DIR \
--cache_dir=$CACHE_DIR \
--model_type=longformer \
--model_name_or_path=$MODEL_DIR \
--tokenizer_name=allenai/longformer-large-4096 \
--config_name=allenai/longformer-large-4096 \
--train_file=$DATA_DIR/train.english.jsonlines \
--predict_file=$DATA_DIR/test.english.jsonlines \
--do_eval \
--num_train_epochs=129 \
--logging_steps=500 \
--save_steps=3000 \
--eval_steps=1000 \
--max_seq_length=4096 \
--train_file_cache=$DATA_DIR/train.english.4096.pkl \
--predict_file_cache=$DATA_DIR/test.english.4096.pkl \
--amp \
--normalise_loss \
--max_total_seq_len=5000 \
--experiment_name=eval_model \
--warmup_steps=5600 \
--adam_epsilon=1e-6 \
--head_learning_rate=3e-4 \
--learning_rate=1e-5 \
--adam_beta2=0.98 \
--weight_decay=0.01 \
--dropout_prob=0.3 \
--save_if_best \
--top_lambda=0.4 \
--tensorboard_dir=$OUTPUT_DIR/tb \
--conll_path_for_eval=$DATA_DIR/$SPLIT_FOR_EVAL.english.v4_gold_conll
I get the following output:
08/30/2022 21:51:23 - INFO - __main__ - model_type - longformer
08/30/2022 21:51:23 - INFO - __main__ - model_name_or_path - /home/tanner/git/s2e-coref/model
08/30/2022 21:51:23 - INFO - __main__ - output_dir - output
08/30/2022 21:51:23 - INFO - __main__ - train_file_cache - /home/tanner/conll/conll-2012/english/train.english.4096.pkl
08/30/2022 21:51:23 - INFO - __main__ - predict_file_cache - /home/tanner/conll/conll-2012/english/test.english.4096.pkl
08/30/2022 21:51:23 - INFO - __main__ - train_file - /home/tanner/conll/conll-2012/english/train.english.jsonlines
08/30/2022 21:51:23 - INFO - __main__ - predict_file - /home/tanner/conll/conll-2012/english/test.english.jsonlines
08/30/2022 21:51:23 - INFO - __main__ - config_name - allenai/longformer-large-4096
08/30/2022 21:51:23 - INFO - __main__ - tokenizer_name - allenai/longformer-large-4096
08/30/2022 21:51:23 - INFO - __main__ - cache_dir - cache
08/30/2022 21:51:23 - INFO - __main__ - max_seq_length - 4096
08/30/2022 21:51:23 - INFO - __main__ - do_train - False
08/30/2022 21:51:23 - INFO - __main__ - do_eval - True
08/30/2022 21:51:23 - INFO - __main__ - do_lower_case - False
08/30/2022 21:51:23 - INFO - __main__ - nonfreeze_params - None
08/30/2022 21:51:23 - INFO - __main__ - learning_rate - 1e-05
08/30/2022 21:51:23 - INFO - __main__ - head_learning_rate - 0.0003
08/30/2022 21:51:23 - INFO - __main__ - dropout_prob - 0.3
08/30/2022 21:51:23 - INFO - __main__ - gradient_accumulation_steps - 1
08/30/2022 21:51:23 - INFO - __main__ - weight_decay - 0.01
08/30/2022 21:51:23 - INFO - __main__ - adam_beta1 - 0.9
08/30/2022 21:51:23 - INFO - __main__ - adam_beta2 - 0.98
08/30/2022 21:51:23 - INFO - __main__ - adam_epsilon - 1e-06
08/30/2022 21:51:23 - INFO - __main__ - num_train_epochs - 129.0
08/30/2022 21:51:23 - INFO - __main__ - warmup_steps - 5600
08/30/2022 21:51:23 - INFO - __main__ - logging_steps - 500
08/30/2022 21:51:23 - INFO - __main__ - eval_steps - 1000
08/30/2022 21:51:23 - INFO - __main__ - save_steps - 3000
08/30/2022 21:51:23 - INFO - __main__ - no_cuda - False
08/30/2022 21:51:23 - INFO - __main__ - overwrite_output_dir - False
08/30/2022 21:51:23 - INFO - __main__ - seed - 42
08/30/2022 21:51:23 - INFO - __main__ - local_rank - -1
08/30/2022 21:51:23 - INFO - __main__ - amp - True
08/30/2022 21:51:23 - INFO - __main__ - fp16_opt_level - O1
08/30/2022 21:51:23 - INFO - __main__ - max_span_length - 30
08/30/2022 21:51:23 - INFO - __main__ - top_lambda - 0.4
08/30/2022 21:51:23 - INFO - __main__ - max_total_seq_len - 5000
08/30/2022 21:51:23 - INFO - __main__ - experiment_name - eval_model
08/30/2022 21:51:23 - INFO - __main__ - normalise_loss - True
08/30/2022 21:51:23 - INFO - __main__ - ffnn_size - 3072
08/30/2022 21:51:23 - INFO - __main__ - save_if_best - True
08/30/2022 21:51:23 - INFO - __main__ - batch_size_1 - False
08/30/2022 21:51:23 - INFO - __main__ - tensorboard_dir - output/tb
08/30/2022 21:51:23 - INFO - __main__ - conll_path_for_eval - /home/tanner/conll/conll-2012/english/dev.english.v4_gold_conll
08/30/2022 21:51:23 - INFO - __main__ - n_gpu - 0
08/30/2022 21:51:23 - INFO - __main__ - device - cpu
Writing output/meta.json
08/30/2022 21:51:23 - INFO - __main__ - Process rank: -1, device: cpu, n_gpu: 0, distributed training: False, amp training: True
08/30/2022 21:51:34 - INFO - __main__ - Training/evaluation parameters Namespace(adam_beta1=0.9, adam_beta2=0.98, adam_epsilon=1e-06, amp=True, batch_size_1=False, cache_dir='cache', config_name='allenai/longformer-large-4096', conll_path_for_eval='/home/tanner/conll/conll-2012/english/dev.english.v4_gold_conll', device=device(type='cpu'), do_eval=True, do_lower_case=False, do_train=False, dropout_prob=0.3, eval_steps=1000, experiment_name='eval_model', ffnn_size=3072, fp16_opt_level='O1', gradient_accumulation_steps=1, head_learning_rate=0.0003, learning_rate=1e-05, local_rank=-1, logging_steps=500, max_seq_length=4096, max_span_length=30, max_total_seq_len=5000, model_name_or_path='/home/tanner/git/s2e-coref/model', model_type='longformer', n_gpu=0, no_cuda=False, nonfreeze_params=None, normalise_loss=True, num_train_epochs=129.0, output_dir='output', overwrite_output_dir=False, predict_file='/home/tanner/conll/conll-2012/english/test.english.jsonlines', predict_file_cache='/home/tanner/conll/conll-2012/english/test.english.4096.pkl', save_if_best=True, save_steps=3000, seed=42, tensorboard_dir='output/tb', tokenizer_name='allenai/longformer-large-4096', top_lambda=0.4, train_file='/home/tanner/conll/conll-2012/english/train.english.jsonlines', train_file_cache='/home/tanner/conll/conll-2012/english/train.english.4096.pkl', warmup_steps=5600, weight_decay=0.01)
08/30/2022 21:51:34 - INFO - data - Reading dataset from /home/tanner/conll/conll-2012/english/test.english.jsonlines
08/30/2022 21:51:47 - INFO - data - Finished preprocessing Coref dataset. 348 examples were extracted, 0 were filtered due to sequence length.
/home/tanner/git/s2e-coref/env-3.8/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:1767: FutureWarning: The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
warnings.warn(
08/30/2022 21:51:47 - INFO - eval - ***** Running evaluation final_evaluation *****
08/30/2022 21:51:47 - INFO - eval - Examples number: 348
08/30/2022 22:00:59 - INFO - eval - ***** Eval results final_evaluation *****
08/30/2022 22:00:59 - INFO - eval - loss = 0.421
08/30/2022 22:00:59 - INFO - eval - post pruning mention precision = 0.248
08/30/2022 22:00:59 - INFO - eval - post pruning mention recall = 0.961
08/30/2022 22:00:59 - INFO - eval - post pruning mention f1 = 0.395
08/30/2022 22:00:59 - INFO - eval - mention precision = 0.893
08/30/2022 22:00:59 - INFO - eval - mention recall = 0.878
08/30/2022 22:00:59 - INFO - eval - mention f1 = 0.886
08/30/2022 22:00:59 - INFO - eval - precision = 0.812
08/30/2022 22:00:59 - INFO - eval - recall = 0.795
08/30/2022 22:00:59 - INFO - eval - f1 = 0.804
and an error:
Traceback (most recent call last):
File "run_coref.py", line 155, in <module>
main()
File "run_coref.py", line 147, in main
result = evaluator.evaluate(model, prefix="final_evaluation", official=True)
File "/home/tanner/git/s2e-coref/eval.py", line 138, in evaluate
conll_results = evaluate_conll(self.args.conll_path_for_eval, doc_to_prediction, doc_to_subtoken_map)
File "/home/tanner/git/s2e-coref/conll.py", line 98, in evaluate_conll
output_conll(gold_file, prediction_file, predictions, subtoken_maps)
File "/home/tanner/git/s2e-coref/conll.py", line 47, in output_conll
start_map, end_map, word_map = prediction_map[doc_key]
KeyError: 'bc/cctv/00/cctv_0000_0'
I am looking into this further and will post back my solution if I find one. In the meantime, I wanted to check whether @zyt888 (or @yuvalkirstain or @oriram) knows what the problem might be.
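One thing that may be worth checking: in the log above, conll_path_for_eval resolves to dev.english.v4_gold_conll while predict_file is test.english.jsonlines, so the gold file and the predictions may cover different documents. Below is a minimal sketch of a check for that, not code from the repository. It assumes that gold documents start with a header of the form "#begin document (ID); part NNN", that the lookup key is ID followed by an underscore and the part number without leading zeros (which matches the key in the KeyError), and that each jsonlines record has a "doc_key" field. The function names are hypothetical.

```python
import json
import re
import sys

# Assumed CoNLL-2012 document header, e.g.
#   #begin document (bc/cctv/00/cctv_0000); part 000
BEGIN_DOCUMENT = re.compile(r"#begin document \((.*)\); part (\d+)")


def gold_doc_keys(conll_lines):
    """Collect document keys from the header lines of a gold CoNLL file."""
    keys = set()
    for line in conll_lines:
        match = BEGIN_DOCUMENT.match(line)
        if match:
            # Assumed key format: "<doc id>_<part as int>", e.g. "..._0000_0".
            keys.add("{}_{}".format(match.group(1), int(match.group(2))))
    return keys


def predicted_doc_keys(jsonlines):
    """Collect the "doc_key" field of every non-empty jsonlines record."""
    return {json.loads(line)["doc_key"] for line in jsonlines if line.strip()}


def missing_from_predictions(conll_lines, jsonlines):
    """Return gold document keys that have no matching jsonlines record."""
    return sorted(gold_doc_keys(conll_lines) - predicted_doc_keys(jsonlines))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_doc_keys.py GOLD.v4_gold_conll PREDICT.jsonlines
    with open(sys.argv[1]) as gold, open(sys.argv[2]) as pred:
        missing = missing_from_predictions(gold, pred)
    print("{} gold documents have no prediction".format(len(missing)))
    for key in missing[:10]:
        print(key)
```

If this reports missing keys, the gold CoNLL file and the predict file are probably from different splits, and pointing both at the same split would be the first thing to try.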