
Error when training on GPU?

Open winston52 opened this issue 3 years ago • 0 comments

Hello,

I followed the instructions up to the step of training bert and ebert on the QQP dataset. I trained both on a GPU but did not get reasonable results.

I trained them with these commands:

python train.py -m bert -t qqp 2>&1 | tee data/qqp-bert-train.log
python train.py -m ebert -t qqp 2>&1 | tee data/qqp-ebert-train.log


The training log for bert is:

WARNING:tensorflow:From /data1/hwt/deformer/common/optimizer.py:91: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

INFO:2020-12-11_18:12:51.913:/data1/hwt/deformer/common/config.py:130: config_file: /data1/hwt/deformer/config/bert_classifier.ini
INFO:2020-12-11_18:12:51.915:/data1/hwt/deformer/common/config.py:79: task set to env qqp instead of provided
INFO:2020-12-11_18:12:51.916:/data1/hwt/deformer/common/config.py:79: mode set to env train instead of provided train
INFO:2020-12-11_18:12:51.917:/data1/hwt/deformer/common/config.py:96: (train) dataset_file: /data1/hwt/deformer/data/datasets/converted/bert/qqp-train.327464.tfrecord
WARNING:tensorflow:From train.py:18: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

INFO:2020-12-11_18:12:51.919:train.py:28: config:
  attention_dropout_prob: 0.1
  attention_head_size: 64
  bfloat16:
  checkpoint_dir: /data1/hwt/deformer/data/ckpt/bert-base/qqp
  data_dir: /data1/hwt/deformer/data
  dataset_file: /data1/hwt/deformer/data/datasets/converted/bert/qqp-train.327464.tfrecord
  dataset_size: 327464
  debug: False
  dev_batch_size: 16
  epochs: 3
  ground_truth_file: /data1/hwt/deformer/data/datasets/converted/bert/qqp-dev.*.jsonl
  hidden_dropout_prob: 0.1
  hidden_size: 768
  inference_graph: /data1/hwt/deformer/data/ckpt/bert/qqp_bert_infer.pb
  init_checkpoint: /data1/hwt/deformer/data/ckpt/init/uncased_base/bert_model.ckpt
  initializer_range: 0.02
  input_buffer_size: 2000
  input_num_threads: 8
  intermediate_act_fn: gelu
  intermediate_size: 3072
  iterations_per_loop: 1000
  keep_checkpoint_max: 20
  learning_rate: 5e-05
  lower_case: True
  max_first_length: 40
  max_position_embeddings: 512
  max_seq_length: 100
  mode: train
  model: bert
  num_choices: 0
  num_classes: 2
  num_heads: 12
  num_hidden_layers: 12
  num_tpu_cores: 8
  num_train_steps: 30699
  num_warmup_steps: 4604
  optimize_padding: False
  output_file: /data1/hwt/deformer/data/predictions/bert/qqp-dev-predictions.json
  print_steps: 100
  random_seed: 0
  steps_per_checkpoint: 1000
  task: qqp
  tpu_name:
  train_batch_size: 32
  type_vocab_size: 2
  use_host_call: True
  use_replace_map: True
  use_tpu: False
  vocab_file: /data1/hwt/deformer/data/res/bert.vocab
  vocab_size: 30522
  warmup_ratio: 0.15
The current process just got forked. Disabling parallelism to avoid deadlocks...
To disable this warning, please explicitly set TOKENIZERS_PARALLELISM=(true | false)
WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

  • https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  • https://github.com/tensorflow/addons
  • https://github.com/tensorflow/io (for I/O related ops)

If you depend on functionality not listed there, please file an issue.
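As a sanity check on the config dump above: the logged num_train_steps and num_warmup_steps are consistent with the usual BERT fine-tuning arithmetic. The sketch below assumes (this is not confirmed from the deformer source) that the steps are derived as dataset_size * epochs / train_batch_size, rounded down, and the warmup steps as num_train_steps * warmup_ratio, rounded down. The function name `training_schedule` is hypothetical.

```python
def training_schedule(dataset_size, train_batch_size, epochs, warmup_ratio):
    """Return (num_train_steps, num_warmup_steps) under the assumed formulas."""
    # Assumption: total optimizer steps = examples seen / batch size, rounded down.
    num_train_steps = dataset_size * epochs // train_batch_size
    # Assumption: warmup covers a fixed fraction of all steps, rounded down.
    num_warmup_steps = int(num_train_steps * warmup_ratio)
    return num_train_steps, num_warmup_steps


# Values copied from the config dump in the log.
steps, warmup = training_schedule(
    dataset_size=327464, train_batch_size=32, epochs=3, warmup_ratio=0.15
)
print(steps, warmup)  # 30699 4604, matching the logged values
```

Since the numbers agree with the log, the dataset size, batch size, and epoch count were at least picked up as intended, so those settings are unlikely to explain the unreasonable result.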

WARNING:tensorflow:From /data1/hwt/deformer/common/tf_util.py:116: The name tf.keras.initializers.TruncatedNormal is deprecated. Please use tf.compat.v1.keras.initializers.TruncatedNormal instead.

WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/keras/initializers.py:94: calling TruncatedNormal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:Estimator's model_fn (<function model_fn_builder.<locals>.model_fn at 0x7f886427d268>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
INFO:2020-12-11_18:12:52.653:train.py:33: begin training for 30699 steps....
WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/data/util/random_seed.py:58: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:2020-12-11_18:12:53.027:/data1/hwt/deformer/common/builder.py:48: *** Features ***
INFO:2020-12-11_18:12:53.027:/data1/hwt/deformer/common/builder.py:50: name=feature_id, shape=(32,)
INFO:2020-12-11_18:12:53.027:/data1/hwt/deformer/common/builder.py:50: name=input_ids, shape=(32, 100)
INFO:2020-12-11_18:12:53.027:/data1/hwt/deformer/common/builder.py:50: name=segment_ids, shape=(32, 100)
WARNING:tensorflow:From /data1/hwt/deformer/common/builder.py:63: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.

WARNING:tensorflow:From /data1/hwt/deformer/common/builder.py:107: The name tf.train.init_from_checkpoint is deprecated. Please use tf.compat.v1.train.init_from_checkpoint instead.

INFO:2020-12-11_18:12:56.598:/data1/hwt/deformer/common/builder.py:109: **** Initialized Variables ****
INFO:2020-12-11_18:12:56.598:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/word_embeddings:0, shape=(30522, 768)
INFO:2020-12-11_18:12:56.598:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/token_type_embeddings:0, shape=(2, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/position_embeddings:0, shape=(512, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.599:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.600:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.601:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.602:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.603:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/output/dense/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/intermediate/dense/kernel:0, shape=(768, 3072)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/intermediate/dense/bias:0, shape=(3072,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/dense/kernel:0, shape=(3072, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/dense/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/pooler/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/pooler/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.604:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/classifier/dense/kernel:0, shape=(768, 2) 
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/classifier/dense/bias:0, shape=(2,)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:117: **** Trainable Variables ****
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/embeddings/word_embeddings:0, shape=(30522, 768)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/embeddings/token_type_embeddings:0, shape=(2, 768)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/embeddings/position_embeddings:0, shape=(512, 768)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/embeddings/layer_norm/gamma:0, shape=(768,)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/embeddings/layer_norm/beta:0, shape=(768,)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/self/query/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/self/query/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/self/key/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/self/key/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/self/value/kernel:0, shape=(768, 768)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/self/value/bias:0, shape=(768,)
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123:
name=bert_classifier/bert/encoder/layer_0/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_0/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/self/key/kernel:0, shape=(768, 768) 
INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.605:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_1/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: 
name=bert_classifier/bert/encoder/layer_1/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/intermediate/dense/bias:0, shape=(3072,) 
INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_2/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: 
name=bert_classifier/bert/encoder/layer_3/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.606:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_3/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/self/value/kernel:0, shape=(768, 768) 
INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_4/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: 
name=bert_classifier/bert/encoder/layer_5/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.607:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/output/dense/bias:0, shape=(768,) 
INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_5/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: 
name=bert_classifier/bert/encoder/layer_6/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_6/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.608:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/output/dense/kernel:0, shape=(768, 768) 
INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_7/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: 
name=bert_classifier/bert/encoder/layer_8/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.609:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_8/output/layer_norm/beta:0, shape=(768,) 
INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: 
name=bert_classifier/bert/encoder/layer_9/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_9/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.610:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/output/layer_norm/gamma:0, shape=(768,) 
INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_10/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: 
name=bert_classifier/bert/encoder/layer_11/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/output/dense/bias:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/encoder/layer_11/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/pooler/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/bert/pooler/dense/bias:0, shape=(768,) 
INFO:2020-12-11_18:12:56.611:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/classifier/dense/kernel:0, shape=(768, 2) INFO:2020-12-11_18:12:56.612:/data1/hwt/deformer/common/builder.py:123: name=bert_classifier/classifier/dense/bias:0, shape=(2,) WARNING:tensorflow:From /data1/hwt/deformer/common/optimizer.py:27: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.

WARNING:tensorflow:From /data1/hwt/deformer/common/optimizer.py:32: The name tf.train.polynomial_decay is deprecated. Please use tf.compat.v1.train.polynomial_decay instead.

WARNING:tensorflow:From /data1/hwt/deformer/common/optimizer.py:133: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From /data1/hwt/deformer/common/builder.py:195: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

WARNING:tensorflow:From /data1/hwt/deformer/tasks/classifier.py:78: The name tf.metrics.accuracy is deprecated. Please use tf.compat.v1.metrics.accuracy instead.

WARNING:tensorflow:From /data1/hwt/deformer/tasks/classifier.py:87: The name tf.metrics.mean is deprecated. Please use tf.compat.v1.metrics.mean instead.

WARNING:tensorflow:From /data1/hwt/deformer/common/builder.py:199: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

2020-12-11 18:13:07.068641: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA 2020-12-11 18:13:07.103862: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200000000 Hz 2020-12-11 18:13:07.107824: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b5889265d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices: 2020-12-11 18:13:07.107875: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version 2020-12-11 18:13:07.112465: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1 2020-12-11 18:13:07.369551: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b5913c0bf0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices: 2020-12-11 18:13:07.369605: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5 2020-12-11 18:13:07.369620: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): TITAN RTX, Compute Capability 7.5 2020-12-11 18:13:07.373568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:1a:00.0 2020-12-11 18:13:07.376250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 1 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:89:00.0 2020-12-11 18:13:07.376647: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0 2020-12-11 18:13:07.378811: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0 2020-12-11 18:13:07.380499: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0 2020-12-11 18:13:07.380788: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0 2020-12-11 18:13:07.382088: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0 2020-12-11 18:13:07.383096: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0 2020-12-11 18:13:07.386400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7 2020-12-11 18:13:07.391995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0, 1 2020-12-11 18:13:07.392043: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0 2020-12-11 18:13:07.395613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-12-11 18:13:07.395629: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0 1 2020-12-11 18:13:07.395636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N N 2020-12-11 18:13:07.395641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 1: N N 2020-12-11 18:13:07.400003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22080 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:1a:00.0, compute capability: 7.5) 2020-12-11 18:13:07.401819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 16707 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:89:00.0, compute capability: 7.5) 2020-12-11 18:13:31.147009: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/saver.py:963: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file APIs to delete files with this prefix.
INFO:2020-12-11_20:33:44.190:train.py:38: training ended!
INFO:2020-12-11_20:33:44.191:train.py:39: all done, took 2:20:52.271457 s!_


and the eval log for bert is:

_WARNING:tensorflow:From /data1/hwt/deformer/common/optimizer.py:91: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

INFO:2020-12-11_20:43:58.792:/data1/hwt/deformer/common/config.py:130: config_file: /data1/hwt/deformer/config/bert_classifier.ini
INFO:2020-12-11_20:43:58.793:/data1/hwt/deformer/common/config.py:79: task set to env qqp instead of provided
INFO:2020-12-11_20:43:58.794:/data1/hwt/deformer/common/config.py:79: mode set to env dev instead of provided train
INFO:2020-12-11_20:43:58.795:/data1/hwt/deformer/common/config.py:96: (dev) dataset_file: /data1/hwt/deformer/data/datasets/converted/bert/qqp-dev.40430.tfrecord
WARNING:tensorflow:From eval.py:24: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

INFO:2020-12-11_20:43:58.797:eval.py:31: config: attention_dropout_prob: 0.1 attention_head_size: 64 bfloat16: checkpoint_dir: /data1/hwt/deformer/data/ckpt/bert-base/qqp checkpoint_path: None data_dir: /data1/hwt/deformer/data dataset_file: /data1/hwt/deformer/data/datasets/converted/bert/qqp-dev.40430.tfrecord dataset_size: 40430 debug: False dev_batch_size: 16 epochs: 3 ground_truth_file: /data1/hwt/deformer/data/datasets/converted/bert/qqp-dev.40430.jsonl hidden_dropout_prob: 0.1 hidden_size: 768 inference_graph: /data1/hwt/deformer/data/ckpt/bert/qqp_bert_infer.pb init_checkpoint: /data1/hwt/deformer/data/ckpt/init/uncased_base/bert_model.ckpt initializer_range: 0.02 input_buffer_size: 2000 input_num_threads: 8 intermediate_act_fn: gelu intermediate_size: 3072 iterate_checkpoints: False iterate_timeout: 3600 iterations_per_loop: 1000 keep_checkpoint_max: 20 learning_rate: 5e-05 lower_case: True max_first_length: 40 max_position_embeddings: 512 max_seq_length: 100 mode: dev model: bert num_choices: 0 num_classes: 2 num_heads: 12 num_hidden_layers: 12 num_tpu_cores: 8 optimize_padding: False output_file: /data1/hwt/deformer/data/predictions/bert/qqp-dev-predictions.json print_steps: 100 random_seed: 0 steps_per_checkpoint: 1000 task: qqp tpu_name: train_batch_size: 32 type_vocab_size: 2 use_host_call: True use_replace_map: True use_tpu: False vocab_file: /data1/hwt/deformer/data/res/bert.vocab vocab_size: 30522 warmup_ratio: 0.15 The current process just got forked. Disabling parallelism to avoid deadlocks... To disable this warning, please explicitly set TOKENIZERS_PARALLELISM=(true | false) WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

  • https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  • https://github.com/tensorflow/addons
  • https://github.com/tensorflow/io (for I/O related ops)

If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /data1/hwt/deformer/common/tf_util.py:116: The name tf.keras.initializers.TruncatedNormal is deprecated. Please use tf.compat.v1.keras.initializers.TruncatedNormal instead.

WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/keras/initializers.py:94: calling TruncatedNormal.init (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version. Instructions for updating: Call initializer instance with the dtype argument instead of passing it to the constructor WARNING:tensorflow:Estimator's model_fn (<function model_fn_builder..model_fn at 0x7f7c59839268>) includes params argument, but params are not passed to Estimator. WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False. INFO:2020-12-11_20:43:59.573:eval.py:42: loading examples from /data1/hwt/deformer/data/datasets/converted/bert/qqp-dev.40430.jsonl.... INFO:2020-12-11_20:44:01.166:eval.py:48: begin evaluating /data1/hwt/deformer/data/ckpt/bert-base/qqp/model.ckpt-30699... WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.init (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version. Instructions for updating: If using Keras pass *_constraint arguments to layers. INFO:2020-12-11_20:44:01.598:/data1/hwt/deformer/common/builder.py:48: *** Features *** INFO:2020-12-11_20:44:01.598:/data1/hwt/deformer/common/builder.py:50: name=feature_id, shape=(?,) INFO:2020-12-11_20:44:01.598:/data1/hwt/deformer/common/builder.py:50: name=input_ids, shape=(?, 100) INFO:2020-12-11_20:44:01.598:/data1/hwt/deformer/common/builder.py:50: name=segment_ids, shape=(?, 100) WARNING:tensorflow:From /data1/hwt/deformer/common/builder.py:63: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.

WARNING:tensorflow:From /data1/hwt/deformer/common/builder.py:107: The name tf.train.init_from_checkpoint is deprecated. Please use tf.compat.v1.train.init_from_checkpoint instead.

INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:109: **** Initialized Variables **** INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/word_embeddings:0, shape=(30522, 768) INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/token_type_embeddings:0, shape=(2, 768) INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/position_embeddings:0, shape=(512, 768) INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/embeddings/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.059:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/dense/kernel:0, shape=(768, 768) 
INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_0/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: 
name=bert_classifier/bert/encoder/layer_1/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.060:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_1/output/layer_norm/beta:0, shape=(768,) 
INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: 
name=bert_classifier/bert/encoder/layer_2/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.061:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_2/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/layer_norm/gamma:0, shape=(768,) 
INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_3/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.062:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: 
name=bert_classifier/bert/encoder/layer_4/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_4/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/query/bias:0, shape=(768,) 
INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.063:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: 
name=bert_classifier/bert/encoder/layer_5/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_5/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/intermediate/dense/kernel:0, shape=(768, 3072) 
INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.064:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_6/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: 
name=bert_classifier/bert/encoder/layer_7/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_7/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/key/bias:0, shape=(768,) 
INFO:2020-12-11_20:44:06.065:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_8/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: 
name=bert_classifier/bert/encoder/layer_9/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.066:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/dense/kernel:0, shape=(3072, 768) 
INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_9/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/self/value/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: 
name=bert_classifier/bert/encoder/layer_10/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_10/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.067:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/query/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/query/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/key/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/key/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/value/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/self/value/bias:0, shape=(768,) 
INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/attention/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/intermediate/dense/kernel:0, shape=(768, 3072) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/intermediate/dense/bias:0, shape=(3072,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/dense/kernel:0, shape=(3072, 768) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/layer_norm/gamma:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/encoder/layer_11/output/layer_norm/beta:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/pooler/dense/kernel:0, shape=(768, 768) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/bert/pooler/dense/bias:0, shape=(768,) INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/classifier/dense/kernel:0, shape=(768, 2) 
INFO:2020-12-11_20:44:06.068:/data1/hwt/deformer/common/builder.py:114: name=bert_classifier/classifier/dense/bias:0, shape=(2,)
WARNING:tensorflow:From /home/hwt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.where in 2.0, which has the same broadcast rule as np.where
2020-12-11 20:44:06.420797: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-12-11 20:44:06.456048: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200000000 Hz
2020-12-11 20:44:06.460253: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aeaaefa890 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-11 20:44:06.460297: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-11 20:44:06.465020: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-12-11 20:44:06.747001: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aeaade4fd0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-12-11 20:44:06.747060: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5
2020-12-11 20:44:06.747075: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): TITAN RTX, Compute Capability 7.5
2020-12-11 20:44:06.750935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:1a:00.0
2020-12-11 20:44:06.751883: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 1 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:89:00.0
2020-12-11 20:44:06.752308: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-12-11 20:44:06.754967: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-12-11 20:44:06.757226: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-12-11 20:44:06.757793: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-12-11 20:44:06.760374: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-12-11 20:44:06.761648: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-12-11 20:44:06.765756: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-12-11 20:44:06.769820: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0, 1
2020-12-11 20:44:06.769868: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-12-11 20:44:06.772734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-11 20:44:06.772752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0 1
2020-12-11 20:44:06.772758: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N N
2020-12-11 20:44:06.772763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 1: N N
2020-12-11 20:44:06.776361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22080 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:1a:00.0, compute capability: 7.5)
2020-12-11 20:44:06.777403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 770 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:89:00.0, compute capability: 7.5)
2020-12-11 20:44:09.491162: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
INFO:2020-12-11_20:44:10.350:eval.py:120: model.ckpt-30699, predicted 10/(2526) batches
INFO:2020-12-11_20:44:10.925:eval.py:120: model.ckpt-30699, predicted 20/(2526) batches
INFO:2020-12-11_20:44:11.521:eval.py:120: model.ckpt-30699, predicted 30/(2526) batches
INFO:2020-12-11_20:44:12.126:eval.py:120: model.ckpt-30699, predicted 40/(2526) batches
INFO:2020-12-11_20:44:12.719:eval.py:120: model.ckpt-30699, predicted 50/(2526) batches
INFO:2020-12-11_20:44:13.327:eval.py:120: model.ckpt-30699, predicted 60/(2526) batches
INFO:2020-12-11_20:44:13.929:eval.py:120: model.ckpt-30699, predicted 70/(2526) batches
INFO:2020-12-11_20:44:14.522:eval.py:120: model.ckpt-30699, predicted 80/(2526) batches
INFO:2020-12-11_20:44:15.108:eval.py:120: model.ckpt-30699, predicted 90/(2526) batches
INFO:2020-12-11_20:44:15.726:eval.py:120: model.ckpt-30699, predicted 100/(2526) batches
INFO:2020-12-11_20:44:16.330:eval.py:120: model.ckpt-30699, predicted 110/(2526) batches INFO:2020-12-11_20:44:16.928:eval.py:120: model.ckpt-30699, predicted 120/(2526) batches INFO:2020-12-11_20:44:17.530:eval.py:120: model.ckpt-30699, predicted 130/(2526) batches INFO:2020-12-11_20:44:18.134:eval.py:120: model.ckpt-30699, predicted 140/(2526) batches INFO:2020-12-11_20:44:18.719:eval.py:120: model.ckpt-30699, predicted 150/(2526) batches INFO:2020-12-11_20:44:19.299:eval.py:120: model.ckpt-30699, predicted 160/(2526) batches INFO:2020-12-11_20:44:19.889:eval.py:120: model.ckpt-30699, predicted 170/(2526) batches INFO:2020-12-11_20:44:20.493:eval.py:120: model.ckpt-30699, predicted 180/(2526) batches INFO:2020-12-11_20:44:21.092:eval.py:120: model.ckpt-30699, predicted 190/(2526) batches INFO:2020-12-11_20:44:21.688:eval.py:120: model.ckpt-30699, predicted 200/(2526) batches INFO:2020-12-11_20:44:22.284:eval.py:120: model.ckpt-30699, predicted 210/(2526) batches INFO:2020-12-11_20:44:22.882:eval.py:120: model.ckpt-30699, predicted 220/(2526) batches INFO:2020-12-11_20:44:23.475:eval.py:120: model.ckpt-30699, predicted 230/(2526) batches INFO:2020-12-11_20:44:24.088:eval.py:120: model.ckpt-30699, predicted 240/(2526) batches INFO:2020-12-11_20:44:24.691:eval.py:120: model.ckpt-30699, predicted 250/(2526) batches INFO:2020-12-11_20:44:25.302:eval.py:120: model.ckpt-30699, predicted 260/(2526) batches INFO:2020-12-11_20:44:25.918:eval.py:120: model.ckpt-30699, predicted 270/(2526) batches INFO:2020-12-11_20:44:26.538:eval.py:120: model.ckpt-30699, predicted 280/(2526) batches INFO:2020-12-11_20:44:27.131:eval.py:120: model.ckpt-30699, predicted 290/(2526) batches INFO:2020-12-11_20:44:27.741:eval.py:120: model.ckpt-30699, predicted 300/(2526) batches INFO:2020-12-11_20:44:28.351:eval.py:120: model.ckpt-30699, predicted 310/(2526) batches INFO:2020-12-11_20:44:28.955:eval.py:120: model.ckpt-30699, predicted 320/(2526) batches INFO:2020-12-11_20:44:29.551:eval.py:120: 
model.ckpt-30699, predicted 330/(2526) batches INFO:2020-12-11_20:44:30.140:eval.py:120: model.ckpt-30699, predicted 340/(2526) batches INFO:2020-12-11_20:44:30.726:eval.py:120: model.ckpt-30699, predicted 350/(2526) batches INFO:2020-12-11_20:44:31.306:eval.py:120: model.ckpt-30699, predicted 360/(2526) batches INFO:2020-12-11_20:44:31.890:eval.py:120: model.ckpt-30699, predicted 370/(2526) batches INFO:2020-12-11_20:44:32.479:eval.py:120: model.ckpt-30699, predicted 380/(2526) batches INFO:2020-12-11_20:44:33.098:eval.py:120: model.ckpt-30699, predicted 390/(2526) batches INFO:2020-12-11_20:44:33.694:eval.py:120: model.ckpt-30699, predicted 400/(2526) batches INFO:2020-12-11_20:44:34.298:eval.py:120: model.ckpt-30699, predicted 410/(2526) batches INFO:2020-12-11_20:44:34.890:eval.py:120: model.ckpt-30699, predicted 420/(2526) batches INFO:2020-12-11_20:44:35.494:eval.py:120: model.ckpt-30699, predicted 430/(2526) batches INFO:2020-12-11_20:44:36.100:eval.py:120: model.ckpt-30699, predicted 440/(2526) batches INFO:2020-12-11_20:44:36.703:eval.py:120: model.ckpt-30699, predicted 450/(2526) batches INFO:2020-12-11_20:44:37.313:eval.py:120: model.ckpt-30699, predicted 460/(2526) batches INFO:2020-12-11_20:44:37.914:eval.py:120: model.ckpt-30699, predicted 470/(2526) batches INFO:2020-12-11_20:44:38.511:eval.py:120: model.ckpt-30699, predicted 480/(2526) batches INFO:2020-12-11_20:44:39.094:eval.py:120: model.ckpt-30699, predicted 490/(2526) batches INFO:2020-12-11_20:44:39.708:eval.py:120: model.ckpt-30699, predicted 500/(2526) batches INFO:2020-12-11_20:44:40.341:eval.py:120: model.ckpt-30699, predicted 510/(2526) batches INFO:2020-12-11_20:44:40.957:eval.py:120: model.ckpt-30699, predicted 520/(2526) batches INFO:2020-12-11_20:44:41.545:eval.py:120: model.ckpt-30699, predicted 530/(2526) batches INFO:2020-12-11_20:44:42.157:eval.py:120: model.ckpt-30699, predicted 540/(2526) batches INFO:2020-12-11_20:44:42.758:eval.py:120: model.ckpt-30699, predicted 550/(2526) 
batches INFO:2020-12-11_20:44:43.360:eval.py:120: model.ckpt-30699, predicted 560/(2526) batches INFO:2020-12-11_20:44:43.977:eval.py:120: model.ckpt-30699, predicted 570/(2526) batches INFO:2020-12-11_20:44:44.570:eval.py:120: model.ckpt-30699, predicted 580/(2526) batches INFO:2020-12-11_20:44:45.174:eval.py:120: model.ckpt-30699, predicted 590/(2526) batches INFO:2020-12-11_20:44:45.778:eval.py:120: model.ckpt-30699, predicted 600/(2526) batches INFO:2020-12-11_20:44:46.385:eval.py:120: model.ckpt-30699, predicted 610/(2526) batches INFO:2020-12-11_20:44:46.971:eval.py:120: model.ckpt-30699, predicted 620/(2526) batches INFO:2020-12-11_20:44:47.580:eval.py:120: model.ckpt-30699, predicted 630/(2526) batches INFO:2020-12-11_20:44:48.172:eval.py:120: model.ckpt-30699, predicted 640/(2526) batches INFO:2020-12-11_20:44:48.790:eval.py:120: model.ckpt-30699, predicted 650/(2526) batches INFO:2020-12-11_20:44:49.401:eval.py:120: model.ckpt-30699, predicted 660/(2526) batches INFO:2020-12-11_20:44:50.017:eval.py:120: model.ckpt-30699, predicted 670/(2526) batches INFO:2020-12-11_20:44:50.619:eval.py:120: model.ckpt-30699, predicted 680/(2526) batches INFO:2020-12-11_20:44:51.211:eval.py:120: model.ckpt-30699, predicted 690/(2526) batches INFO:2020-12-11_20:44:51.817:eval.py:120: model.ckpt-30699, predicted 700/(2526) batches INFO:2020-12-11_20:44:52.435:eval.py:120: model.ckpt-30699, predicted 710/(2526) batches INFO:2020-12-11_20:44:53.038:eval.py:120: model.ckpt-30699, predicted 720/(2526) batches INFO:2020-12-11_20:44:53.642:eval.py:120: model.ckpt-30699, predicted 730/(2526) batches INFO:2020-12-11_20:44:54.238:eval.py:120: model.ckpt-30699, predicted 740/(2526) batches INFO:2020-12-11_20:44:54.830:eval.py:120: model.ckpt-30699, predicted 750/(2526) batches INFO:2020-12-11_20:44:55.416:eval.py:120: model.ckpt-30699, predicted 760/(2526) batches INFO:2020-12-11_20:44:56.007:eval.py:120: model.ckpt-30699, predicted 770/(2526) batches 
INFO:2020-12-11_20:44:56.604:eval.py:120: model.ckpt-30699, predicted 780/(2526) batches INFO:2020-12-11_20:44:57.218:eval.py:120: model.ckpt-30699, predicted 790/(2526) batches INFO:2020-12-11_20:44:57.843:eval.py:120: model.ckpt-30699, predicted 800/(2526) batches INFO:2020-12-11_20:44:58.454:eval.py:120: model.ckpt-30699, predicted 810/(2526) batches INFO:2020-12-11_20:44:59.046:eval.py:120: model.ckpt-30699, predicted 820/(2526) batches INFO:2020-12-11_20:44:59.658:eval.py:120: model.ckpt-30699, predicted 830/(2526) batches INFO:2020-12-11_20:45:00.265:eval.py:120: model.ckpt-30699, predicted 840/(2526) batches INFO:2020-12-11_20:45:00.885:eval.py:120: model.ckpt-30699, predicted 850/(2526) batches INFO:2020-12-11_20:45:01.504:eval.py:120: model.ckpt-30699, predicted 860/(2526) batches INFO:2020-12-11_20:45:02.110:eval.py:120: model.ckpt-30699, predicted 870/(2526) batches INFO:2020-12-11_20:45:02.698:eval.py:120: model.ckpt-30699, predicted 880/(2526) batches INFO:2020-12-11_20:45:03.296:eval.py:120: model.ckpt-30699, predicted 890/(2526) batches INFO:2020-12-11_20:45:03.904:eval.py:120: model.ckpt-30699, predicted 900/(2526) batches INFO:2020-12-11_20:45:04.503:eval.py:120: model.ckpt-30699, predicted 910/(2526) batches INFO:2020-12-11_20:45:05.130:eval.py:120: model.ckpt-30699, predicted 920/(2526) batches INFO:2020-12-11_20:45:05.749:eval.py:120: model.ckpt-30699, predicted 930/(2526) batches INFO:2020-12-11_20:45:06.359:eval.py:120: model.ckpt-30699, predicted 940/(2526) batches INFO:2020-12-11_20:45:06.956:eval.py:120: model.ckpt-30699, predicted 950/(2526) batches INFO:2020-12-11_20:45:07.560:eval.py:120: model.ckpt-30699, predicted 960/(2526) batches INFO:2020-12-11_20:45:08.182:eval.py:120: model.ckpt-30699, predicted 970/(2526) batches INFO:2020-12-11_20:45:08.793:eval.py:120: model.ckpt-30699, predicted 980/(2526) batches INFO:2020-12-11_20:45:09.395:eval.py:120: model.ckpt-30699, predicted 990/(2526) batches INFO:2020-12-11_20:45:09.973:eval.py:120: 
model.ckpt-30699, predicted 1000/(2526) batches INFO:2020-12-11_20:45:10.566:eval.py:120: model.ckpt-30699, predicted 1010/(2526) batches INFO:2020-12-11_20:45:11.157:eval.py:120: model.ckpt-30699, predicted 1020/(2526) batches INFO:2020-12-11_20:45:11.762:eval.py:120: model.ckpt-30699, predicted 1030/(2526) batches INFO:2020-12-11_20:45:12.364:eval.py:120: model.ckpt-30699, predicted 1040/(2526) batches INFO:2020-12-11_20:45:12.974:eval.py:120: model.ckpt-30699, predicted 1050/(2526) batches INFO:2020-12-11_20:45:13.591:eval.py:120: model.ckpt-30699, predicted 1060/(2526) batches INFO:2020-12-11_20:45:14.194:eval.py:120: model.ckpt-30699, predicted 1070/(2526) batches INFO:2020-12-11_20:45:14.788:eval.py:120: model.ckpt-30699, predicted 1080/(2526) batches INFO:2020-12-11_20:45:15.386:eval.py:120: model.ckpt-30699, predicted 1090/(2526) batches INFO:2020-12-11_20:45:15.989:eval.py:120: model.ckpt-30699, predicted 1100/(2526) batches INFO:2020-12-11_20:45:16.602:eval.py:120: model.ckpt-30699, predicted 1110/(2526) batches INFO:2020-12-11_20:45:17.214:eval.py:120: model.ckpt-30699, predicted 1120/(2526) batches INFO:2020-12-11_20:45:17.825:eval.py:120: model.ckpt-30699, predicted 1130/(2526) batches INFO:2020-12-11_20:45:18.422:eval.py:120: model.ckpt-30699, predicted 1140/(2526) batches INFO:2020-12-11_20:45:19.006:eval.py:120: model.ckpt-30699, predicted 1150/(2526) batches INFO:2020-12-11_20:45:19.609:eval.py:120: model.ckpt-30699, predicted 1160/(2526) batches INFO:2020-12-11_20:45:20.202:eval.py:120: model.ckpt-30699, predicted 1170/(2526) batches INFO:2020-12-11_20:45:20.800:eval.py:120: model.ckpt-30699, predicted 1180/(2526) batches INFO:2020-12-11_20:45:21.411:eval.py:120: model.ckpt-30699, predicted 1190/(2526) batches INFO:2020-12-11_20:45:22.026:eval.py:120: model.ckpt-30699, predicted 1200/(2526) batches INFO:2020-12-11_20:45:22.627:eval.py:120: model.ckpt-30699, predicted 1210/(2526) batches INFO:2020-12-11_20:45:23.226:eval.py:120: model.ckpt-30699, 
predicted 1220/(2526) batches INFO:2020-12-11_20:45:23.831:eval.py:120: model.ckpt-30699, predicted 1230/(2526) batches INFO:2020-12-11_20:45:24.442:eval.py:120: model.ckpt-30699, predicted 1240/(2526) batches INFO:2020-12-11_20:45:25.055:eval.py:120: model.ckpt-30699, predicted 1250/(2526) batches INFO:2020-12-11_20:45:25.677:eval.py:120: model.ckpt-30699, predicted 1260/(2526) batches INFO:2020-12-11_20:45:26.300:eval.py:120: model.ckpt-30699, predicted 1270/(2526) batches INFO:2020-12-11_20:45:26.893:eval.py:120: model.ckpt-30699, predicted 1280/(2526) batches INFO:2020-12-11_20:45:27.496:eval.py:120: model.ckpt-30699, predicted 1290/(2526) batches INFO:2020-12-11_20:45:28.092:eval.py:120: model.ckpt-30699, predicted 1300/(2526) batches INFO:2020-12-11_20:45:28.701:eval.py:120: model.ckpt-30699, predicted 1310/(2526) batches INFO:2020-12-11_20:45:29.297:eval.py:120: model.ckpt-30699, predicted 1320/(2526) batches INFO:2020-12-11_20:45:29.877:eval.py:120: model.ckpt-30699, predicted 1330/(2526) batches INFO:2020-12-11_20:45:30.483:eval.py:120: model.ckpt-30699, predicted 1340/(2526) batches INFO:2020-12-11_20:45:31.078:eval.py:120: model.ckpt-30699, predicted 1350/(2526) batches INFO:2020-12-11_20:45:31.683:eval.py:120: model.ckpt-30699, predicted 1360/(2526) batches INFO:2020-12-11_20:45:32.278:eval.py:120: model.ckpt-30699, predicted 1370/(2526) batches INFO:2020-12-11_20:45:32.892:eval.py:120: model.ckpt-30699, predicted 1380/(2526) batches INFO:2020-12-11_20:45:33.490:eval.py:120: model.ckpt-30699, predicted 1390/(2526) batches INFO:2020-12-11_20:45:34.090:eval.py:120: model.ckpt-30699, predicted 1400/(2526) batches INFO:2020-12-11_20:45:34.699:eval.py:120: model.ckpt-30699, predicted 1410/(2526) batches INFO:2020-12-11_20:45:35.294:eval.py:120: model.ckpt-30699, predicted 1420/(2526) batches INFO:2020-12-11_20:45:35.903:eval.py:120: model.ckpt-30699, predicted 1430/(2526) batches INFO:2020-12-11_20:45:36.526:eval.py:120: model.ckpt-30699, predicted 
1440/(2526) batches INFO:2020-12-11_20:45:37.107:eval.py:120: model.ckpt-30699, predicted 1450/(2526) batches INFO:2020-12-11_20:45:37.692:eval.py:120: model.ckpt-30699, predicted 1460/(2526) batches INFO:2020-12-11_20:45:38.296:eval.py:120: model.ckpt-30699, predicted 1470/(2526) batches INFO:2020-12-11_20:45:38.881:eval.py:120: model.ckpt-30699, predicted 1480/(2526) batches INFO:2020-12-11_20:45:39.479:eval.py:120: model.ckpt-30699, predicted 1490/(2526) batches INFO:2020-12-11_20:45:40.063:eval.py:120: model.ckpt-30699, predicted 1500/(2526) batches INFO:2020-12-11_20:45:40.661:eval.py:120: model.ckpt-30699, predicted 1510/(2526) batches INFO:2020-12-11_20:45:41.274:eval.py:120: model.ckpt-30699, predicted 1520/(2526) batches INFO:2020-12-11_20:45:41.879:eval.py:120: model.ckpt-30699, predicted 1530/(2526) batches INFO:2020-12-11_20:45:42.492:eval.py:120: model.ckpt-30699, predicted 1540/(2526) batches INFO:2020-12-11_20:45:43.084:eval.py:120: model.ckpt-30699, predicted 1550/(2526) batches INFO:2020-12-11_20:45:43.692:eval.py:120: model.ckpt-30699, predicted 1560/(2526) batches INFO:2020-12-11_20:45:44.296:eval.py:120: model.ckpt-30699, predicted 1570/(2526) batches INFO:2020-12-11_20:45:44.889:eval.py:120: model.ckpt-30699, predicted 1580/(2526) batches INFO:2020-12-11_20:45:45.502:eval.py:120: model.ckpt-30699, predicted 1590/(2526) batches INFO:2020-12-11_20:45:46.092:eval.py:120: model.ckpt-30699, predicted 1600/(2526) batches INFO:2020-12-11_20:45:46.689:eval.py:120: model.ckpt-30699, predicted 1610/(2526) batches INFO:2020-12-11_20:45:47.295:eval.py:120: model.ckpt-30699, predicted 1620/(2526) batches INFO:2020-12-11_20:45:47.902:eval.py:120: model.ckpt-30699, predicted 1630/(2526) batches INFO:2020-12-11_20:45:48.513:eval.py:120: model.ckpt-30699, predicted 1640/(2526) batches INFO:2020-12-11_20:45:49.102:eval.py:120: model.ckpt-30699, predicted 1650/(2526) batches INFO:2020-12-11_20:45:49.709:eval.py:120: model.ckpt-30699, predicted 1660/(2526) batches 
INFO:2020-12-11_20:45:50.308:eval.py:120: model.ckpt-30699, predicted 1670/(2526) batches INFO:2020-12-11_20:45:50.894:eval.py:120: model.ckpt-30699, predicted 1680/(2526) batches INFO:2020-12-11_20:45:51.498:eval.py:120: model.ckpt-30699, predicted 1690/(2526) batches INFO:2020-12-11_20:45:52.112:eval.py:120: model.ckpt-30699, predicted 1700/(2526) batches INFO:2020-12-11_20:45:52.726:eval.py:120: model.ckpt-30699, predicted 1710/(2526) batches INFO:2020-12-11_20:45:53.321:eval.py:120: model.ckpt-30699, predicted 1720/(2526) batches INFO:2020-12-11_20:45:53.919:eval.py:120: model.ckpt-30699, predicted 1730/(2526) batches INFO:2020-12-11_20:45:54.522:eval.py:120: model.ckpt-30699, predicted 1740/(2526) batches INFO:2020-12-11_20:45:55.102:eval.py:120: model.ckpt-30699, predicted 1750/(2526) batches INFO:2020-12-11_20:45:55.704:eval.py:120: model.ckpt-30699, predicted 1760/(2526) batches INFO:2020-12-11_20:45:56.317:eval.py:120: model.ckpt-30699, predicted 1770/(2526) batches INFO:2020-12-11_20:45:56.928:eval.py:120: model.ckpt-30699, predicted 1780/(2526) batches INFO:2020-12-11_20:45:57.529:eval.py:120: model.ckpt-30699, predicted 1790/(2526) batches INFO:2020-12-11_20:45:58.132:eval.py:120: model.ckpt-30699, predicted 1800/(2526) batches INFO:2020-12-11_20:45:58.727:eval.py:120: model.ckpt-30699, predicted 1810/(2526) batches INFO:2020-12-11_20:45:59.324:eval.py:120: model.ckpt-30699, predicted 1820/(2526) batches INFO:2020-12-11_20:45:59.916:eval.py:120: model.ckpt-30699, predicted 1830/(2526) batches INFO:2020-12-11_20:46:00.508:eval.py:120: model.ckpt-30699, predicted 1840/(2526) batches INFO:2020-12-11_20:46:01.106:eval.py:120: model.ckpt-30699, predicted 1850/(2526) batches INFO:2020-12-11_20:46:01.719:eval.py:120: model.ckpt-30699, predicted 1860/(2526) batches INFO:2020-12-11_20:46:02.334:eval.py:120: model.ckpt-30699, predicted 1870/(2526) batches INFO:2020-12-11_20:46:02.926:eval.py:120: model.ckpt-30699, predicted 1880/(2526) batches 
INFO:2020-12-11_20:46:03.527:eval.py:120: model.ckpt-30699, predicted 1890/(2526) batches INFO:2020-12-11_20:46:04.133:eval.py:120: model.ckpt-30699, predicted 1900/(2526) batches INFO:2020-12-11_20:46:04.753:eval.py:120: model.ckpt-30699, predicted 1910/(2526) batches INFO:2020-12-11_20:46:05.373:eval.py:120: model.ckpt-30699, predicted 1920/(2526) batches INFO:2020-12-11_20:46:05.985:eval.py:120: model.ckpt-30699, predicted 1930/(2526) batches INFO:2020-12-11_20:46:06.582:eval.py:120: model.ckpt-30699, predicted 1940/(2526) batches INFO:2020-12-11_20:46:07.167:eval.py:120: model.ckpt-30699, predicted 1950/(2526) batches INFO:2020-12-11_20:46:07.775:eval.py:120: model.ckpt-30699, predicted 1960/(2526) batches INFO:2020-12-11_20:46:08.378:eval.py:120: model.ckpt-30699, predicted 1970/(2526) batches INFO:2020-12-11_20:46:08.980:eval.py:120: model.ckpt-30699, predicted 1980/(2526) batches INFO:2020-12-11_20:46:09.579:eval.py:120: model.ckpt-30699, predicted 1990/(2526) batches INFO:2020-12-11_20:46:10.189:eval.py:120: model.ckpt-30699, predicted 2000/(2526) batches INFO:2020-12-11_20:46:10.781:eval.py:120: model.ckpt-30699, predicted 2010/(2526) batches INFO:2020-12-11_20:46:11.374:eval.py:120: model.ckpt-30699, predicted 2020/(2526) batches INFO:2020-12-11_20:46:11.977:eval.py:120: model.ckpt-30699, predicted 2030/(2526) batches INFO:2020-12-11_20:46:12.575:eval.py:120: model.ckpt-30699, predicted 2040/(2526) batches INFO:2020-12-11_20:46:13.183:eval.py:120: model.ckpt-30699, predicted 2050/(2526) batches INFO:2020-12-11_20:46:13.785:eval.py:120: model.ckpt-30699, predicted 2060/(2526) batches INFO:2020-12-11_20:46:14.392:eval.py:120: model.ckpt-30699, predicted 2070/(2526) batches INFO:2020-12-11_20:46:14.982:eval.py:120: model.ckpt-30699, predicted 2080/(2526) batches INFO:2020-12-11_20:46:15.576:eval.py:120: model.ckpt-30699, predicted 2090/(2526) batches INFO:2020-12-11_20:46:16.170:eval.py:120: model.ckpt-30699, predicted 2100/(2526) batches 
INFO:2020-12-11_20:46:16.779:eval.py:120: model.ckpt-30699, predicted 2110/(2526) batches INFO:2020-12-11_20:46:17.387:eval.py:120: model.ckpt-30699, predicted 2120/(2526) batches INFO:2020-12-11_20:46:17.990:eval.py:120: model.ckpt-30699, predicted 2130/(2526) batches INFO:2020-12-11_20:46:18.580:eval.py:120: model.ckpt-30699, predicted 2140/(2526) batches INFO:2020-12-11_20:46:19.163:eval.py:120: model.ckpt-30699, predicted 2150/(2526) batches INFO:2020-12-11_20:46:19.768:eval.py:120: model.ckpt-30699, predicted 2160/(2526) batches INFO:2020-12-11_20:46:20.362:eval.py:120: model.ckpt-30699, predicted 2170/(2526) batches INFO:2020-12-11_20:46:20.967:eval.py:120: model.ckpt-30699, predicted 2180/(2526) batches INFO:2020-12-11_20:46:21.576:eval.py:120: model.ckpt-30699, predicted 2190/(2526) batches INFO:2020-12-11_20:46:22.174:eval.py:120: model.ckpt-30699, predicted 2200/(2526) batches INFO:2020-12-11_20:46:22.770:eval.py:120: model.ckpt-30699, predicted 2210/(2526) batches INFO:2020-12-11_20:46:23.367:eval.py:120: model.ckpt-30699, predicted 2220/(2526) batches INFO:2020-12-11_20:46:23.978:eval.py:120: model.ckpt-30699, predicted 2230/(2526) batches INFO:2020-12-11_20:46:24.589:eval.py:120: model.ckpt-30699, predicted 2240/(2526) batches INFO:2020-12-11_20:46:25.195:eval.py:120: model.ckpt-30699, predicted 2250/(2526) batches INFO:2020-12-11_20:46:25.803:eval.py:120: model.ckpt-30699, predicted 2260/(2526) batches INFO:2020-12-11_20:46:26.399:eval.py:120: model.ckpt-30699, predicted 2270/(2526) batches INFO:2020-12-11_20:46:26.989:eval.py:120: model.ckpt-30699, predicted 2280/(2526) batches INFO:2020-12-11_20:46:27.601:eval.py:120: model.ckpt-30699, predicted 2290/(2526) batches INFO:2020-12-11_20:46:28.199:eval.py:120: model.ckpt-30699, predicted 2300/(2526) batches INFO:2020-12-11_20:46:28.787:eval.py:120: model.ckpt-30699, predicted 2310/(2526) batches INFO:2020-12-11_20:46:29.396:eval.py:120: model.ckpt-30699, predicted 2320/(2526) batches 
INFO:2020-12-11_20:46:30.004:eval.py:120: model.ckpt-30699, predicted 2330/(2526) batches INFO:2020-12-11_20:46:30.597:eval.py:120: model.ckpt-30699, predicted 2340/(2526) batches INFO:2020-12-11_20:46:31.176:eval.py:120: model.ckpt-30699, predicted 2350/(2526) batches INFO:2020-12-11_20:46:31.774:eval.py:120: model.ckpt-30699, predicted 2360/(2526) batches INFO:2020-12-11_20:46:32.393:eval.py:120: model.ckpt-30699, predicted 2370/(2526) batches INFO:2020-12-11_20:46:33.008:eval.py:120: model.ckpt-30699, predicted 2380/(2526) batches INFO:2020-12-11_20:46:33.609:eval.py:120: model.ckpt-30699, predicted 2390/(2526) batches INFO:2020-12-11_20:46:34.216:eval.py:120: model.ckpt-30699, predicted 2400/(2526) batches INFO:2020-12-11_20:46:34.804:eval.py:120: model.ckpt-30699, predicted 2410/(2526) batches INFO:2020-12-11_20:46:35.399:eval.py:120: model.ckpt-30699, predicted 2420/(2526) batches INFO:2020-12-11_20:46:36.008:eval.py:120: model.ckpt-30699, predicted 2430/(2526) batches INFO:2020-12-11_20:46:36.605:eval.py:120: model.ckpt-30699, predicted 2440/(2526) batches INFO:2020-12-11_20:46:37.216:eval.py:120: model.ckpt-30699, predicted 2450/(2526) batches INFO:2020-12-11_20:46:37.819:eval.py:120: model.ckpt-30699, predicted 2460/(2526) batches INFO:2020-12-11_20:46:38.407:eval.py:120: model.ckpt-30699, predicted 2470/(2526) batches INFO:2020-12-11_20:46:38.988:eval.py:120: model.ckpt-30699, predicted 2480/(2526) batches INFO:2020-12-11_20:46:39.582:eval.py:120: model.ckpt-30699, predicted 2490/(2526) batches INFO:2020-12-11_20:46:40.195:eval.py:120: model.ckpt-30699, predicted 2500/(2526) batches INFO:2020-12-11_20:46:40.807:eval.py:120: model.ckpt-30699, predicted 2510/(2526) batches INFO:2020-12-11_20:46:41.421:eval.py:120: model.ckpt-30699, predicted 2520/(2526) batches
INFO:2020-12-11_20:46:43.676:eval.py:67: model.ckpt-30699, accuracy=66.48775661637399, metric=0.19124932847848147, f1=0.19124932847848147
INFO:2020-12-11_20:46:43.676:eval.py:70: evaluation done, took 0:02:44.879326 s!
INFO:2020-12-11_20:46:43.676:eval.py:71: final_predictions saved to: /data1/hwt/deformer/data/predictions/bert/qqp-dev-predictions.json_


The predictions for the QQP dev set (file qqp-dev-predictions.json) are mostly 0.
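
A minimal sketch of how this could be sanity-checked, assuming the predicted and gold labels have already been loaded into two Python lists (the format of the predictions file isn't shown here, so loading is left out). The `summarize` helper and the toy labels are hypothetical and not part of the deformer repo. It counts the predicted labels and recomputes accuracy and F1 for the positive class. A model that predicts almost only 0 can still reach a moderate accuracy on an imbalanced dev set while F1 stays low, which would be consistent with the accuracy and f1 values in the log above.

```python
from collections import Counter


def summarize(preds, golds, positive=1):
    """Return (label counts, accuracy in percent, F1 for the positive class)."""
    assert len(preds) == len(golds) and len(golds) > 0
    pairs = list(zip(preds, golds))
    tp = sum(p == positive and g == positive for p, g in pairs)
    fp = sum(p == positive and g != positive for p, g in pairs)
    fn = sum(p != positive and g == positive for p, g in pairs)
    correct = sum(p == g for p, g in pairs)
    # Guard against division by zero when the positive class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return Counter(preds), 100.0 * correct / len(golds), f1


# Toy labels, made up for illustration only (not real QQP data).
golds = [0] * 6 + [1] * 4
preds = [0] * 6 + [1, 0, 0, 0]  # model predicts 0 for almost everything

counts, accuracy, f1 = summarize(preds, golds)
print(counts, accuracy, f1)
```

If the counts on the real predictions are as skewed as in this toy case, the problem is more likely in fine-tuning (for example, the model collapsing to the majority class) than in the evaluation script.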

I wonder whether this happens just because I trained the model on a GPU?

Can you give me some advice? Thanks!

winston52, Dec 15 '20 01:12