cail2019_track2
After some fiddling, I switched to a different system image on Tencent Cloud, and training now appears to be running on the GPU. How long does this training usually take? The log follows:
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/pooler/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/pooler/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bidirectional_rnn/fw/basic_lstm_cell/kernel:0, shape = (968, 800)
INFO:tensorflow: name = bidirectional_rnn/fw/basic_lstm_cell/bias:0, shape = (800,)
INFO:tensorflow: name = bidirectional_rnn/bw/basic_lstm_cell/kernel:0, shape = (968, 800)
INFO:tensorflow: name = bidirectional_rnn/bw/basic_lstm_cell/bias:0, shape = (800,)
INFO:tensorflow: name = u_omega:0, shape = (1168,)
INFO:tensorflow: name = output_weights:0, shape = (20, 1168)
INFO:tensorflow: name = output_bias:0, shape = (20,)
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/training/ div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2020-07-09 09:42:55.471208: I tensorflow/core/platform/] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-09 09:42:55.649475: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-09 09:42:55.650252: I tensorflow/compiler/xla/service/] XLA service 0x8a58eb0 executing computations on platform CUDA. Devices:
2020-07-09 09:42:55.650293: I tensorflow/compiler/xla/service/] StreamExecutor device (0): Tesla V100-SXM2-32GB, Compute Capability 7.0
2020-07-09 09:42:55.663763: I tensorflow/core/platform/profile_utils/] CPU Frequency: 2500000000 Hz
2020-07-09 09:42:55.664574: I tensorflow/compiler/xla/service/] XLA service 0x9a2a320 executing computations on platform Host. Devices:
2020-07-09 09:42:55.664608: I tensorflow/compiler/xla/service/] StreamExecutor device (0): ,
2020-07-09 09:42:55.665581: I tensorflow/core/common_runtime/gpu/] Found device 0 with properties:
name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:08.0
totalMemory: 31.72GiB freeMemory: 31.31GiB
2020-07-09 09:42:55.665603: I tensorflow/core/common_runtime/gpu/] Adding visible gpu devices: 0
2020-07-09 09:42:55.666469: I tensorflow/core/common_runtime/gpu/] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-09 09:42:55.666486: I tensorflow/core/common_runtime/gpu/] 0
2020-07-09 09:42:55.666493: I tensorflow/core/common_runtime/gpu/] 0: N
2020-07-09 09:42:55.666986: I tensorflow/core/common_runtime/gpu/] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 30459 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:00:08.0, compute capability: 7.0)
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into ckpt/divorce/model.ckpt.
2020-07-09 09:43:22.977919: I tensorflow/stream_executor/] successfully opened CUDA library locally
INFO:tensorflow:global_step/sec: 2.03341
INFO:tensorflow:examples/sec: 65.0693
INFO:tensorflow:global_step/sec: 2.27393
INFO:tensorflow:examples/sec: 72.7656
INFO:tensorflow:global_step/sec: 2.27549
INFO:tensorflow:examples/sec: 72.8157
INFO:tensorflow:global_step/sec: 2.2709
INFO:tensorflow:examples/sec: 72.6686
INFO:tensorflow:global_step/sec: 2.27153
INFO:tensorflow:examples/sec: 72.6891
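On the question of how long training takes: the throughput lines in the log (about 2.27 global_step/sec and 72.7 examples/sec) imply a batch size of roughly 32, so the total time can be estimated once the number of training examples and epochs is known. The thread does not state either, so the values below are placeholders, and `estimate_training_seconds` is a hypothetical helper rather than anything from the repository.

```python
import math


def estimate_training_seconds(num_examples, num_epochs, examples_per_sec, steps_per_sec):
    """Estimate wall-clock training time from the throughput lines in the log."""
    # Each step processes one batch, so examples/sec divided by steps/sec is the batch size.
    batch_size = round(examples_per_sec / steps_per_sec)
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * num_epochs
    return batch_size, total_steps, total_steps / steps_per_sec


# Throughput is taken from the log; num_examples and num_epochs are made-up placeholders.
batch_size, total_steps, seconds = estimate_training_seconds(
    num_examples=10_000, num_epochs=3, examples_per_sec=72.7, steps_per_sec=2.27
)
print("batch size:", batch_size)
print("total steps:", total_steps)
print("estimated minutes:", round(seconds / 60, 1))
```

The estimate ignores checkpoint saving and evaluation pauses, so the real run will take somewhat longer.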
After training, the accuracy is 0.83. Were the training set and the test set perhaps put together for training? Log (模型预测结束 means "model prediction finished"; 总评分如下 means "overall score"):
 99%|█████████▉| 249/252 [01:36<00:01, 2.76it/s]
 99%|█████████▉| 250/252 [01:36<00:00, 2.77it/s]
100%|█████████▉| 251/252 [01:37<00:00, 2.76it/s]
100%|██████████| 252/252 [01:37<00:00, 2.76it/s]
100%|██████████| 252/252 [01:37<00:00, 2.58it/s]
INFO:root:模型预测结束
INFO:root:{'1': 0.96, '2': 0.92, '3': 0.91, '4': 0.93, '5': 0.91, '6': 0.93, '7': 0.93, '8': 0.97, '9': 0.98, '10': 0.88, '11': 0.84, '12': 0.25, '13': 0.83, '14': 0.47, '15': 0.82, '16': 0.79, '17': 0.72, '18': 0.03, '19': 0.32, '20': 0.64}
INFO:root:总评分如下: 0.8298107041994647
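One way to check whether the test examples also appear in the training data is to compare the two sets of sentences directly. This is a minimal sketch, assuming both sets have already been loaded as lists of strings; `overlap_ratio` and the sample sentences are hypothetical and not part of the repository.

```python
def overlap_ratio(train_sentences, test_sentences):
    """Return the fraction of test sentences that also occur verbatim in the training set."""
    if not test_sentences:
        return 0.0
    train_set = {s.strip() for s in train_sentences}
    shared = sum(1 for s in test_sentences if s.strip() in train_set)
    return shared / len(test_sentences)


# Toy data standing in for the real training and test sentences.
train = ["sentence a", "sentence b", "sentence c"]
test = ["sentence b", "sentence d"]
print(overlap_ratio(train, test))
```

A ratio close to 1.0 would mean the evaluation is effectively running on training data, which would explain an unexpectedly high score.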
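This is probably because the training set (divorce) is small. I see that the training set is only 1.93 MB, whereas the official training set I downloaded earlier is 6.08 MB. Also, how do I set up training on multiple GPUs?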
After switching to the 6.08 MB data (divorce), the accuracy dropped only slightly, from 0.83 to 0.81. Surprising... Is this reasonable? I could not reproduce your accuracy of 0.73. The log follows:
 98%|█████████▊| 247/252 [01:35<00:01, 2.76it/s]
 98%|█████████▊| 248/252 [01:36<00:01, 2.75it/s]
 99%|█████████▉| 249/252 [01:36<00:01, 2.75it/s]
 99%|█████████▉| 250/252 [01:37<00:00, 2.74it/s]
100%|█████████▉| 251/252 [01:37<00:00, 2.73it/s]
100%|██████████| 252/252 [01:37<00:00, 2.73it/s]
INFO:root:模型预测结束
INFO:root:{'1': 0.95, '2': 0.91, '3': 0.91, '4': 0.94, '5': 0.9, '6': 0.92, '7': 0.92, '8': 0.96, '9': 0.98, '10': 0.87, '11': 0.84, '12': 0.21, '13': 0.8, '14': 0.31, '15': 0.81, '16': 0.77, '17': 0.64, '18': 0.0, '19': 0.2, '20': 0.61}
INFO:root:总评分如下: 0.811298028037752
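To see which classes pull the result down, the per-class values printed in this log can be averaged and filtered. The sketch below copies the dictionary from the log line; the 0.5 cut-off for a "weak" class is an arbitrary choice for illustration.

```python
# Per-class values copied from the log line above (rounded to two decimals by the logger).
scores = {
    '1': 0.95, '2': 0.91, '3': 0.91, '4': 0.94, '5': 0.9,
    '6': 0.92, '7': 0.92, '8': 0.96, '9': 0.98, '10': 0.87,
    '11': 0.84, '12': 0.21, '13': 0.8, '14': 0.31, '15': 0.81,
    '16': 0.77, '17': 0.64, '18': 0.0, '19': 0.2, '20': 0.61,
}

# Unweighted mean over the 20 classes.
macro_average = sum(scores.values()) / len(scores)

# Classes scoring below an arbitrary 0.5 cut-off, in numeric order.
weak_classes = sorted((k for k, v in scores.items() if v < 0.5), key=int)

print("unweighted mean:", round(macro_average, 4))
print("weak classes:", weak_classes)
```

The unweighted mean comes out near 0.72, noticeably below the reported overall score of 0.81, so the overall score is evidently not a plain unweighted mean of these values; how it is actually combined is defined by the repository's `evaluate` function, which is not shown in this thread.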
Two questions: (1) How large was the training data you used? I would like to reproduce your result. (2) How do I set up training on multiple GPUs?
'11': 0.0, '12': 0.0, '13': 0.0, '14': 0.0, '15': 0.0, '16': 0.0, '17': 0.0, '18': 0.0, '19': 0.0, '20': 0.0
 92%|█████████▏| 47/51 [01:11<00:05, 1.45s/it]
 94%|█████████▍| 48/51 [01:13<00:04, 1.44s/it]
 96%|█████████▌| 49/51 [01:14<00:02, 1.45s/it]
 98%|█████████▊| 50/51 [01:15<00:01, 1.45s/it]
100%|██████████| 51/51 [01:17<00:00, 1.45s/it]
INFO:root:模型预测结束
INFO:root:{'1': 0.87, '2': 0.81, '3': 0.8, '4': 0.76, '5': 0.8, '6': 0.6, '7': 0.86, '8': 0.96, '9': 0.82, '10': 0.89, '11': 0.0, '12': 0.0, '13': 0.0, '14': 0.0, '15': 0.0, '16': 0.0, '17': 0.0, '18': 0.0, '19': 0.0, '20': 0.0}
INFO:root:总评分如下: 0.6121632632937084
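if __name__ == '__main__':
    task = "loan"
    # Pass in the pre-split test data here. Because this is tidied-up code being
    # tested, the training data set is simply loaded for a quick check.
    sentences, labels = load_file("data/loan/data_small_selected.json")
    # sentences, labels = load_file("my_test_data.json")
    logging.info("开始载入bert模型")  # "start loading the BERT model"
    model_1 = BERTModel(task=task, pb_model="pb/loan/model.pb",
                        tagDir="data/loan/tags.txt", threshold=[0.5] * 20,
                        vocab_file="chinese_L-12_H-768_A-12/vocab.txt")
    logging.info("bert模型载入完毕,开始进行预测!!!\n")  # "BERT model loaded, starting prediction"
    logging.info("模型开始预测\n")  # "model starts predicting"
    predicts_1 = model_1.getAllResult(sentences)
    print(predicts_1)
    logging.info("结果:\n")  # "result:"
    logging.info("模型预测结束\n")  # "model prediction finished"
    logging.info("模型每个类别f值计算如下:\n")  # "per-class F-values:"
    score_1, f1_1 = evaluate(predict_labels=predicts_1, target_labels=labels,
                             tag_dir="data/loan/tags.txt")
    logging.info("总评分如下: {}".format(score_1))  # "overall score"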
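The loan log above reports an F-value of 0.0 for classes 11 through 20 while every threshold is fixed at 0.5. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the predicted probabilities for those classes never reach the threshold; a mismatch between the tag file and the model's label order would produce the same symptom. The sketch below is a generic per-class F1 computation for multi-label output, not the repository's `evaluate`, and it assumes a prediction counts as positive when the probability is at or above the threshold.

```python
def per_class_f1(probs, targets, thresholds):
    """Per-class F1 for multi-label data.

    probs: one list of class probabilities per example.
    targets: one list of 0/1 labels per example.
    thresholds: one decision threshold per class.
    """
    f1_scores = []
    for c, threshold in enumerate(thresholds):
        tp = fp = fn = 0
        for p, t in zip(probs, targets):
            predicted = p[c] >= threshold
            if predicted and t[c]:
                tp += 1
            elif predicted and not t[c]:
                fp += 1
            elif not predicted and t[c]:
                fn += 1
        denominator = 2 * tp + fp + fn
        # Convention chosen here: F1 is 0.0 when the class is never predicted or present.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return f1_scores


# Toy example: the second class has true positives but never reaches 0.5.
probs = [[0.9, 0.4], [0.8, 0.3], [0.2, 0.45]]
targets = [[1, 1], [1, 0], [0, 1]]
fixed = per_class_f1(probs, targets, [0.5, 0.5])
lowered = per_class_f1(probs, targets, [0.5, 0.35])
print(fixed)
print(lowered)
```

Printing the raw probabilities for the zero-scoring classes on a few known-positive examples is a quick way to tell a threshold problem apart from a label-mapping problem.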
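Following up on my earlier question about not reproducing your 0.73: the test data I used was most likely wrong. After switching to data_small_selected.json, I ran it twice and got 0.71, which is very close, but I have never reproduced 0.73.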
The test data are not the same... My score is the official website's test score, and I did not publish some of the trick code on GitHub; the README only describes it.
On 2020-07-15 09:55, zhouyang-bigdata wrote: This is probably a problem with my training data file name. I will train a few more times and see.
My QQ is 2648759823. What is yours?
Hello, could we chat on QQ? I would like to ask you a few things. My QQ is 2648759823.