
Has anyone managed to get it working on Windows? Which OS did you use to make it work?

Open FurkanGozukara opened this issue 6 years ago • 2 comments

I have Windows 10 x64, a Core i7 2600K CPU, 32 GB of RAM, and a GTX 1050 Ti GPU.

I have installed the latest Python and TensorFlow.

I also ran these commands:

1) pip3 install tensorflow-gpu regex

2) pip3 install requests tqdm

3) cd GPT2 folder (cloned via bash)

4) python download_model.py PrettyBig

I believe everything is ready; however, I am not able to make it work.

Here are my configuration and the errors I am getting:

Main folder

image

PrettyBig folder

image

PrettyBig.json - file paths are correct and working

image
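Since the screenshots did not survive, here is a minimal sketch of how the "file paths are correct" claim could be double-checked from Python. The helper name `missing_paths` is hypothetical and not part of the repository; the keys `encoder_path` and `model_path` are taken from the config dump printed further down. It assumes the config is a flat JSON object.

```python
import json
import os
import tempfile


def missing_paths(config_file, keys=("encoder_path", "model_path")):
    """Return (key, path) pairs from the config whose path does not exist on disk.

    Hypothetical helper for illustration; not part of the GPT2 repository.
    """
    with open(config_file, encoding="utf-8") as fh:
        params = json.load(fh)
    return [(k, params[k]) for k in keys
            if k in params and not os.path.exists(params[k])]


# Demo with a throwaway config; in practice pass the real file, e.g. "PrettyBig.json".
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.json")
    with open(demo, "w", encoding="utf-8") as fh:
        # json.dump escapes backslashes, which matters for Windows paths in JSON.
        json.dump({"encoder_path": tmp,
                   "model_path": os.path.join(tmp, "missing")}, fh)
    print(missing_paths(demo))  # lists only the "model_path" entry
```

An empty list means both paths resolve. Note that in a hand-written JSON file, Windows paths need escaped backslashes (`C:\\GPT2\\encoder`) or forward slashes.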

Here is the command line I used:

C:\GPT2>python main.py --model PrettyBig.json --predict_text "Pikachu"

At first it runs for several minutes with around 70% CPU usage and more than 2 GB of RAM usage.

Here is the full output of the above command:

C:\GPT2>python main.py --model PrettyBig.json --predict_text "Pikachu"
{'n_head': 16, 'encoder_path': 'C:\GPT2\encoder', 'n_vocab': 50257, 'embed_dropout': 0.0, 'lr': 0.00025, 'warmup_steps': 2000, 'weight_decay': 0.01, 'beta1': 0.9, 'beta2': 0.98, 'epsilon': 1e-09, 'opt_name': 'adam', 'train_batch_size': 256, 'attn_dropout': 0.0, 'train_steps': 10000, 'eval_steps': 10, 'max_steps': 604800, 'data_path': 'gs://connors-datasets/openwebtext/', 'scale': 0.14433756729740646, 'res_dropout': 0.1, 'predict_batch_size': 1, 'eval_batch_size': 256, 'iterations': 100, 'n_embd': 1024, 'input': 'openwebtext_longbiased', 'model': 'GPT2', 'model_path': 'C:\GPT2\PrettyBig', 'n_ctx': 1024, 'predict_path': 'logs/predictions_SortaBig.txt', 'n_layer': 25, 'use_tpu': False, 'precision': 'float32'}
Using config: {'_model_dir': 'C:\GPT2\PrettyBig', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000016DD33ECEB8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Generating predictions...
From C:\Python37\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Calling model_fn.
From C:\GPT2\models\gpt2\sample.py:57: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
From C:\GPT2\models\gpt2\sample.py:59: multinomial (from tensorflow.python.ops.random_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.random.categorical instead.
Done calling model_fn.
Graph was finalized.
From C:\Python37\lib\site-packages\tensorflow\python\training\saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Restoring parameters from C:\GPT2\PrettyBig\model.ckpt
Running local_init_op.
Done running local_init_op.
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,0] = 1024 is not in [0, 1024)
  [[{{node sample_sequence/while/model/GatherV2_1}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 131, in <module>
    predict_fn(network, text, params)
  File "C:\GPT2\predict_fns.py", line 18, in gpt2_predict
    for i, p in enumerate(predictions):
  File "C:\Python37\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 629, in predict
    preds_evaluated = mon_sess.run(predictions)
  File "C:\Python37\lib\site-packages\tensorflow\python\training\monitored_session.py", line 676, in run
    run_metadata=run_metadata)
  File "C:\Python37\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1171, in run
    run_metadata=run_metadata)
  File "C:\Python37\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1270, in run
    raise six.reraise(*original_exc_info)
  File "C:\Python37\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Python37\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1255, in run
    return self._sess.run(*args, **kwargs)
  File "C:\Python37\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1327, in run
    run_metadata=run_metadata)
  File "C:\Python37\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1091, in run
    return self._sess.run(*args, **kwargs)
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,0] = 1024 is not in [0, 1024)
  [[node sample_sequence/while/model/GatherV2_1 (defined at C:\GPT2\models\gpt2\gpt2.py:208) ]]

Caused by op 'sample_sequence/while/model/GatherV2_1', defined at:
  File "main.py", line 131, in <module>
    predict_fn(network, text, params)
  File "C:\GPT2\predict_fns.py", line 18, in gpt2_predict
    for i, p in enumerate(predictions):
  File "C:\Python37\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 611, in predict
    features, None, model_fn_lib.ModeKeys.PREDICT, self.config)
  File "C:\Python37\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\GPT2\model_fns.py", line 62, in gpt2_model
    temperature=1.0, top_k=params["top_k"]
  File "C:\GPT2\models\gpt2\sample.py", line 82, in sample_sequence
    back_prop=False,
  File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3556, in while_loop
    return_same_structure)
  File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3087, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3022, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3525, in <lambda>
    body = lambda i, lv: (i + 1, orig_body(*lv))
  File "C:\GPT2\models\gpt2\sample.py", line 56, in body
    next_outputs = step(params, prev[:, tf.newaxis], past=past)
  File "C:\GPT2\models\gpt2\sample.py", line 40, in step
    lm_output = lm_output = gpt2.model(params=params, X=tokens, past=past, reuse=tf.AUTO_REUSE)
  File "C:\GPT2\models\gpt2\gpt2.py", line 208, in model
    h = tf.gather(wte, X) + tf.gather(wpe, positions_for(X, past_length))
  File "C:\Python37\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "C:\Python37\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3273, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "C:\Python37\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 4390, in gather_v2
    "GatherV2", params=params, indices=indices, axis=axis, name=name)
  File "C:\Python37\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Python37\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "C:\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): indices[0,0] = 1024 is not in [0, 1024) [[node sample_sequence/while/model/GatherV2_1 (defined at C:\GPT2\models\gpt2\gpt2.py:208) ]]
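One plausible reading of this error (an assumption, not a confirmed diagnosis): the failing op is the second gather on line 208 of gpt2.py, `tf.gather(wpe, positions_for(X, past_length))`, and the valid range [0, 1024) matches `'n_ctx': 1024` rather than `'n_vocab': 50257`. That would mean the sampling loop asked for position 1024, one past the last row of the position-embedding table. The sketch below reproduces that failure mode with plain-Python stand-ins; `gather` and `max_new_tokens` are hypothetical helpers, not the repository's code.

```python
# Illustrative sketch only: plain-Python stand-ins, not the repository's code.

def gather(table, indices):
    """Look up rows of `table`, rejecting out-of-range indices
    in the same way the error message in the log describes."""
    rows = []
    for i in indices:
        if not 0 <= i < len(table):
            raise IndexError(f"index {i} is not in [0, {len(table)})")
        rows.append(table[i])
    return rows


def max_new_tokens(n_ctx, prompt_len):
    """Tokens that can be generated before positions run past the table."""
    return max(n_ctx - prompt_len, 0)


n_ctx = 1024                             # 'n_ctx' from the logged config
wpe = [[0.0] for _ in range(n_ctx)]      # dummy position table, rows 0..1023

gather(wpe, [n_ctx - 1])                 # last valid position: fine
try:
    gather(wpe, [n_ctx])                 # position 1024: out of range
except IndexError as err:
    print(err)

print(max_new_tokens(n_ctx, prompt_len=1))
```

If this reading is right, capping the number of generated tokens so that prompt length plus generated length stays within `n_ctx` would avoid the out-of-range lookup. Whether the repository's sampling loop exposes such a cap is not shown in this thread.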

I used a single text input, as the author suggested, but it still fails.

I have also tested the input.txt method.

FurkanGozukara avatar Jun 09 '19 12:06 FurkanGozukara

This is definitely strange and I will have to investigate it more carefully. I'm afraid I don't currently know a solution.

ConnorJL avatar Jun 11 '19 11:06 ConnorJL

> This is definitely strange and I will have to investigate it more carefully. I'm afraid I don't currently know a solution.

Thanks for the reply.

Your trained model does work in a clone of the official GPT-2 repository, though: https://github.com/openai/gpt-2

I cloned the official repository, put your files there, and it works in their setup.

FurkanGozukara avatar Jun 11 '19 12:06 FurkanGozukara