zeshel
Invalid argument: Key: segment_ids. Can't parse serialized Example.
When trying to use run_classifier.sh, I get the error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Key: segment_ids. Can't parse serialized Example.
    [[{{node ParseSingleExample/ParseSingleExample}}]]
    [[IteratorGetNext]]
  (1) Invalid argument: Key: segment_ids. Can't parse serialized Example.
    [[{{node ParseSingleExample/ParseSingleExample}}]]
    [[IteratorGetNext]]
    [[IteratorGetNext/_4055]]
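For context, this error typically means the fixed-length features in the tfrecords (serialized with one --max_seq_length) no longer match the length the parser expects after the flag is reduced. A minimal pure-Python sketch of that mismatch, with an illustrative helper name of my own (not code from the repo):

```python
def check_feature_length(key, values, expected_len):
    """Mimic why a fixed-length parser rejects a record: a feature
    serialized at one length cannot be parsed with a different spec."""
    if len(values) != expected_len:
        raise ValueError(
            f"Key: {key}. Can't parse serialized Example: "
            f"got length {len(values)}, parser expects {expected_len}.")
    return values

# Records written with max_seq_length=256 fail when the run above
# parses them expecting max_seq_length=128:
segment_ids = [0] * 256
try:
    check_feature_length("segment_ids", segment_ids, 128)
except ValueError as err:
    print(err)
```

If that is the cause, regenerating the tfrecords with the same --max_seq_length used at training time should resolve it.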
I'm reducing the sequence length and the batch size to try to fit into the 12 GB of memory on the GPU I'm using, with the following parameters:
python run_classifier.py \
  --do_train=true \
  --do_eval=false \
  --data_dir=$TFRecords \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$INIT \
  --max_seq_length=128 \
  --train_batch_size=4 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --num_cands=64 \
  --save_checkpoints_steps=6000 \
  --output_dir=$EXPTS_DIR/$EXP_NAME \
  --use_tpu=$USE_TPU
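For a rough sense of why memory is tight even at these settings: with num_cands candidates each encoded at max_seq_length, the effective batch the encoder sees is far larger than train_batch_size. A back-of-envelope sketch using the flags above plus assumed BERT-base defaults (an estimate, not a measurement):

```python
# Values from the command above plus assumed BERT-base defaults
batch_size, num_cands, seq_len = 4, 64, 128
hidden, n_layers, bytes_per_float = 768, 12, 4

sequences_per_step = batch_size * num_cands      # 256 sequences through BERT
tokens_per_step = sequences_per_step * seq_len   # 32768 token positions

# One hidden-state tensor per layer; ignores attention maps,
# gradients, and optimizer state, which add several times more
activation_bytes = tokens_per_step * hidden * bytes_per_float * n_layers
print(f"{activation_bytes / 2**30:.1f} GiB of hidden states alone")
```

So each optimizer step pushes 256 full-length sequences through the encoder, which is why a nominally small batch size still exhausts 12 GB.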
I have the same problem as @havocy28
Looking forward to your early reply.
Thanks, Best.
I was able to overcome this error by reducing the maximum sequence length to 64 and making the following changes in create_training_data.py:
Mentions that exceed the maximum sequence length are truncated to it; the prefix is extended only while the mention is still shorter than the maximum sequence length, and the suffix is appended only if room is left over. However, even after reducing the sequence length from 256 to 64, it still only runs with a batch size of 1 (as opposed to 8) in 12 GB of memory. I've attached the changes I made to this post: create_training_data.txt
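A minimal sketch of the truncation rule described above (the function and variable names are mine for illustration, not the ones in create_training_data.py):

```python
def build_context_window(prefix, mention, suffix, max_seq_length):
    """Cap the mention at max_seq_length, then spend any remaining
    room on left context first and right context with what is left."""
    mention = mention[:max_seq_length]
    room = max_seq_length - len(mention)
    left = prefix[-room:] if room > 0 else []
    room -= len(left)
    right = suffix[:room] if room > 0 else []
    return left + mention + right

# An over-long mention is hard-capped at the window size,
# while a short mention keeps its surrounding context:
print(len(build_context_window(list("abcde"), list("LONGMENTION"), list("vwxyz"), 8)))
```

The key property is that the result never exceeds max_seq_length, so the serialized features always match the parser's fixed-length spec.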
Note: You also have to delete the existing tfrecords before running create_training_data.sh again.
Hi @havocy28 ,
Thanks for sharing. Best.
It's probably worth mentioning that the performance was terrible in my evaluation:
I1026 08:16:01.098551 140515070199552 run_classifier.py:440] ***** Eval results *****
I1026 08:16:01.098709 140515070199552 run_classifier.py:442] eval_accuracy = 0.057
I1026 08:16:01.099158 140515070199552 run_classifier.py:442] eval_loss = 4.1595993
I1026 08:16:01.099335 140515070199552 run_classifier.py:442] global_step = 0
I1026 08:16:01.099476 140515070199552 run_classifier.py:442] loss = 4.1595993
I did not adjust the learning rate and there may be bugs in the modifications I made. If anyone finds reduced parameters that work, please share.
Hi @havocy28 , if I want to train this on the GPU, can you share your run_classifier.py? Thanks. Best.