albert
When fine-tuning ALBERT on SQuAD 1.1 - TypeError: Expected binary or unicode string, got None
I tried to run the run_squad_v1 script exactly as described, but got TypeError: Expected binary or unicode string, got None
Can you give the whole command so that we can attempt to reproduce it? Please also give your TF and PY version numbers.
Here is the command: python3 -m run_squad_v1 --albert_config_file='/home/vendy/Desktop/ALBERT-master/albert_base/albert_config.json' --output_dir='/home/vendy/Desktop/ALBERT-master/tmp' --train_file='/home/vendy/Desktop/ALBERT-master/SQUAD data/train-v1.1.json' --predict_file='/home/vendy/Desktop/ALBERT-master/SQUAD data/dev-v1.1.json' --spm_model_file='/home/vendy/Desktop/ALBERT-master/albert_base/30k-clean.model' --do_lower_case --max_seq_length=384 --doc_stride=128 --max_query_length=64 --do_train=true --do_predict=true --train_batch_size=48 --predict_batch_size=8 --learning_rate=5e-5 --num_train_epochs=2.0 --warmup_proportion=.1 --save_checkpoints_steps=5000 --n_best_size=20 --max_answer_length=30
PY: 3.6 TF: 1.15
@vendyv I ran into this same issue, then I realised the script expects you to supply the train_feature_file parameter. You can give it a path to a file (*.tfrecord) and if it doesn't exist, it will create one for you.
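The "use the file if it exists, otherwise create it" behaviour described above can be sketched roughly as follows. This is only an illustration in plain Python, not the code from run_squad_v1; `load_or_create_features` and `fake_builder` are hypothetical names, and the real script writes actual tfrecord data rather than a placeholder.

```python
import os
import tempfile


def load_or_create_features(feature_path, build_features):
    """Return (path, created), generating the file only if it is missing.

    `build_features` is a hypothetical callback that writes the features
    to the given path; it stands in for the SQuAD preprocessing step.
    """
    if feature_path is None:
        # An unset flag arrives as None, so fail with a readable message.
        raise ValueError("feature_path is not set; pass a file path")
    if os.path.exists(feature_path):
        return feature_path, False  # reuse the cached file
    build_features(feature_path)
    return feature_path, True


def fake_builder(path):
    # Placeholder for the expensive preprocessing step.
    with open(path, "wb") as f:
        f.write(b"serialized features")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train_feature_file.tfrecord")
        print(load_or_create_features(path, fake_builder))  # first call creates
        print(load_or_create_features(path, fake_builder))  # second call reuses
```

The point is that the path itself has to be supplied even when the file does not exist yet, because the path is what gets checked.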
Hello, I have run into the same error as you. Did you solve it?
@MaybeLL Do you have the train_feature_file parameter defined? It fixed it for me when I added the parameter
I don't know if train_feature_file is the same as train_file, but I only have the train_file parameter for run_race.py. It looks like:
--train_file=/home/dy/Project/ALBERT/train_file/train.tfrecord
and here is the comment about it:
flags.DEFINE_string("train_file", None, "path to preprocessed tfrecord file. " "The file will be generated if not exst.")
Below is the error I am facing while running the command:
python -m run_squad_v1 \
> --albert_config_file=/media/xxxx/NewVolume/ALBERT/albert_base/albert_config.json \
> --output_dir=/media/xxxx/NewVolume/ALBERT/tmp \
> --train_file=/media/xxxx/NewVolume/ALBERT/data1/train-v1.1.json \
> --predict_file=/media/xxxx/NewVolume/ALBERT/data1/dev-v1.1.json \
> --spm_model_file=/media/xxxx/NewVolume/ALBERT/albert_base/30k-clean.model \
> --do_lower_case \
> --max_seq_length=384 \
> --doc_stride=128 \
> --max_query_length=64 \
> --do_train=false \
> --do_predict=true \
> --train_batch_size=48 \
> --predict_batch_size=8 \
> --learning_rate=5e-5 \
> --num_train_epochs=2.0 \
> --warmup_proportion=.1 \
> --save_checkpoints_steps=5000 \
> --n_best_size=20 \
> --max_answer_length=30
WARNING:tensorflow:From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:206: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
W0113 15:12:16.637617 140307062036288 module_wrapper.py:139] From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:206: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
INFO:tensorflow:loading sentence piece model
I0113 15:12:16.637814 140307062036288 tokenization.py:240] loading sentence piece model
WARNING:tensorflow:Estimator's model_fn (<function v1_model_fn_builder.<locals>.model_fn at 0x7f9b633440d0>) includes params argument, but params are not passed to Estimator.
W0113 15:12:17.200998 140307062036288 estimator.py:1994] Estimator's model_fn (<function v1_model_fn_builder.<locals>.model_fn at 0x7f9b633440d0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Using config: {'_model_dir': '/media/xxxx/NewVolume/ALBERT/tmp', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 0, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f9b66fd70b8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1), '_cluster': None}
I0113 15:12:17.201757 140307062036288 estimator.py:212] Using config: {'_model_dir': '/media/xxxx/NewVolume/ALBERT/tmp', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 0, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f9b66fd70b8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1), '_cluster': None}
INFO:tensorflow:_TPUContext: eval_on_tpu True
I0113 15:12:17.202082 140307062036288 tpu_context.py:220] _TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
W0113 15:12:17.202302 140307062036288 tpu_context.py:222] eval_on_tpu ignored because use_tpu is False.
WARNING:tensorflow:From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:303: The name tf.gfile.Open is deprecated. Please use tf.io.gfile.GFile instead.
W0113 15:12:17.202427 140307062036288 module_wrapper.py:139] From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:303: The name tf.gfile.Open is deprecated. Please use tf.io.gfile.GFile instead.
WARNING:tensorflow:From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:309: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.
W0113 15:12:17.317886 140307062036288 module_wrapper.py:139] From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:309: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.
Traceback (most recent call last):
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/media/xxxx/NewVolume/ALBERT/run_squad_v1.py", line 478, in <module>
tf.compat.v1.app.run()
File "/home/xxxx/.local/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/xxxx/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/xxxx/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/media/xxxx/NewVolume/ALBERT/run_squad_v1.py", line 309, in main
if (tf.gfile.Exists(FLAGS.predict_feature_file) and tf.gfile.Exists(
File "/home/xxxx/.local/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 262, in file_exists
return file_exists_v2(filename)
File "/home/xxxx/.local/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 280, in file_exists_v2
pywrap_tensorflow.FileExists(compat.as_bytes(path))
File "/home/xxxx/.local/lib/python3.6/site-packages/tensorflow_core/python/util/compat.py", line 71, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got None
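The last frames of the traceback show the path being converted to bytes before the existence check, and that conversion is what rejects None. A minimal sketch of the mechanism, assuming a simplified stand-in for TensorFlow's helper (the `as_bytes` and `file_exists` functions below are rewritten here for illustration and are not the library code):

```python
import os


def as_bytes(bytes_or_text, encoding="utf-8"):
    """Simplified stand-in for the path conversion seen in the traceback."""
    if isinstance(bytes_or_text, str):
        return bytes_or_text.encode(encoding)
    if isinstance(bytes_or_text, bytes):
        return bytes_or_text
    raise TypeError(
        "Expected binary or unicode string, got %r" % (bytes_or_text,))


def file_exists(path):
    # Stand-in for tf.gfile.Exists: the path is converted before any lookup.
    return os.path.exists(as_bytes(path))


# A string flag that was never passed on the command line holds its default.
predict_feature_file = None

try:
    file_exists(predict_feature_file)
except TypeError as err:
    print(err)  # Expected binary or unicode string, got None
```

So the error says nothing about the config or data files; it only means that some path flag reaching the existence check was never set.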
You still need to supply a path to a tfrecord file for train_feature_file. It is different from train_file. You can specify a filename that does not exist yet, and the script will create the file for you.
Try
python -m run_squad_v1 \
--albert_config_file=/media/xxxx/NewVolume/ALBERT/albert_base/albert_config.json \
--output_dir=/media/xxxx/NewVolume/ALBERT/tmp \
--train_file=/media/xxxx/NewVolume/ALBERT/data1/train-v1.1.json \
--train_feature_file=/media/xxxx/NewVolume/ALBERT/data1/feature_file.tfrecord \
--predict_file=/media/xxxx/NewVolume/ALBERT/data1/dev-v1.1.json \
--spm_model_file=/media/xxxx/NewVolume/ALBERT/albert_base/30k-clean.model \
--do_lower_case \
--max_seq_length=384 \
--doc_stride=128 \
--max_query_length=64 \
--do_train=false \
--do_predict=true \
--train_batch_size=48 \
--predict_batch_size=8 \
--learning_rate=5e-5 \
--num_train_epochs=2.0 \
--warmup_proportion=.1 \
--save_checkpoints_steps=5000 \
--n_best_size=20 \
--max_answer_length=30
@spark-ming
Yes, I have tried the same, but I am still getting the unicode error:
python -m run_squad_v1 --albert_config_file=/media/xxxx/NewVolume/ALBERT/albert_base/albert_config.json --output_dir=/media/xxxx/NewVolume/ALBERT/tmp --train_file=/media/xxxx/NewVolume/ALBERT/data1/train-v1.1.json --train_feature_file=/media/xxxx/NewVolume/ALBERT/data1/feature_file.tfrecord --predict_file=/media/xxxx/NewVolume/ALBERT/data1/dev-v1.1.json --spm_model_file=/media/xxxx/NewVolume/ALBERT/albert_base/30k-clean.model --do_lower_case --max_seq_length=384 --doc_stride=128 --max_query_length=64 --do_train=false --do_predict=true --train_batch_size=48 --predict_batch_size=8 --learning_rate=5e-5 --num_train_epochs=2.0 --warmup_proportion=.1 --save_checkpoints_steps=5000 --n_best_size=20 --max_answer_length=30
WARNING:tensorflow:From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:206: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
W0114 16:47:02.894027 140206005626688 module_wrapper.py:139] From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:206: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
INFO:tensorflow:loading sentence piece model
I0114 16:47:02.894243 140206005626688 tokenization.py:240] loading sentence piece model
WARNING:tensorflow:Estimator's model_fn (<function v1_model_fn_builder.<locals>.model_fn at 0x7f83d77c9ae8>) includes params argument, but params are not passed to Estimator.
W0114 16:47:03.487462 140206005626688 estimator.py:1994] Estimator's model_fn (<function v1_model_fn_builder.<locals>.model_fn at 0x7f83d77c9ae8>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Using config: {'_model_dir': '/media/xxxx/NewVolume/ALBERT/tmp', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 0, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f83db463208>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1), '_cluster': None}
I0114 16:47:03.488247 140206005626688 estimator.py:212] Using config: {'_model_dir': '/media/xxxx/NewVolume/ALBERT/tmp', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 0, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f83db463208>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1), '_cluster': None}
INFO:tensorflow:_TPUContext: eval_on_tpu True
I0114 16:47:03.488590 140206005626688 tpu_context.py:220] _TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
W0114 16:47:03.488810 140206005626688 tpu_context.py:222] eval_on_tpu ignored because use_tpu is False.
WARNING:tensorflow:From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:303: The name tf.gfile.Open is deprecated. Please use tf.io.gfile.GFile instead.
W0114 16:47:03.488927 140206005626688 module_wrapper.py:139] From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:303: The name tf.gfile.Open is deprecated. Please use tf.io.gfile.GFile instead.
WARNING:tensorflow:From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:309: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.
W0114 16:47:03.604506 140206005626688 module_wrapper.py:139] From /media/xxxx/NewVolume/ALBERT/run_squad_v1.py:309: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.
Traceback (most recent call last):
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/media/xxxx/NewVolume/ALBERT/run_squad_v1.py", line 478, in <module>
tf.compat.v1.app.run()
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/xxxx/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/xxxx/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/media/xxxx/NewVolume/ALBERT/run_squad_v1.py", line 309, in main
if (tf.gfile.Exists(FLAGS.predict_feature_file) and tf.gfile.Exists(
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 262, in file_exists
return file_exists_v2(filename)
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 280, in file_exists_v2
pywrap_tensorflow.FileExists(compat.as_bytes(path))
File "/home/xxxx/anaconda3/envs/albert/lib/python3.6/site-packages/tensorflow_core/python/util/compat.py", line 71, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got None
All the requirements specified in the requirements.txt file have been installed in an environment. Is it a problem with the config file or the data files?
I am having this issue as well, did you find a solution?
I can't fix this right now, but I believe it is a bug with the default value of --predict_feature_file. It defaults to None, but should default to an empty string: https://github.com/google-research/ALBERT/blob/master/run_squad_v1.py#L77
As a workaround, provide the name of the predict feature file, like: --predict_feature_file=
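Until the default is changed, a small pre-flight check can name the unset flag before TensorFlow raises the less helpful TypeError. This is a rough sketch only: `find_unset_path_flags` and the `flags` dictionary are hypothetical and not part of the ALBERT scripts, and the flag names are the ones mentioned in this thread.

```python
def find_unset_path_flags(flag_values, required):
    """Return the names in `required` whose value is None."""
    return [name for name in required if flag_values.get(name) is None]


# Hypothetical snapshot of flag values for a predict-only run.
flags = {
    "predict_file": "dev-v1.1.json",
    "predict_feature_file": None,  # not passed on the command line
}

missing = find_unset_path_flags(
    flags, ["predict_file", "predict_feature_file"])
if missing:
    print("Please set: " + ", ".join("--" + name for name in missing))
```

The check only looks for None, since that is the value that triggers the error reported here.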