gpt-2-simple
Continue training with checkpoint loaded from Google Drive failed
gpt2.mount_gdrive()
gpt2.copy_checkpoint_from_gdrive("gpt2_medium_run1")
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name='run1')
gpt2.finetune(
    sess,
    dataset=file_path,
    steps=500,
    print_every=10,
    sample_every=200,
    save_every=500,
    overwrite=True
)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-b53c2e790190> in <module>()
6 sample_every=200,
7 save_every=500,
----> 8 overwrite=True
9 )
6 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py in _get_single_variable(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource, constraint, synchronization, aggregation)
862 tb = [x for x in tb if "tensorflow/python" not in x[0]][:5]
863 raise ValueError("%s Originally defined at:\n\n%s" %
--> 864 (err_msg, "".join(traceback.format_list(tb))))
865 found_var = self._vars[name]
866 if not shape.is_compatible_with(found_var.get_shape()):
ValueError: Variable model/wpe already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
File "/usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py", line 183, in model
initializer=tf.compat.v1.random_normal_initializer(stddev=0.01))
File "/usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py", line 345, in load_gpt2
output = model.model(hparams=hparams, X=context)
File "<ipython-input-3-f81553695c16>", line 4, in <module>
gpt2.load_gpt2(sess, run_name='run1')
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2828, in run_ast_nodes
if self.run_code(code, result):
I have tried the method from issue 80, but it did not help.
Isn't there a discrepancy between your two "run names"?
Moreover, what happens if you leave overwrite at its default value (False)?
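The discrepancy in question is that the checkpoint is copied as "gpt2_medium_run1" but loaded as 'run1'. Below is a minimal sketch for checking which run names actually exist locally. It assumes each run lives in its own subfolder of checkpoint/ (the default checkpoint folder). available_runs is a hypothetical helper, not part of gpt-2-simple.

```python
import os


def available_runs(checkpoint_dir="checkpoint"):
    """Return the sorted names of run folders found under checkpoint_dir.

    Hypothetical helper: assumes one subfolder per run name.
    """
    if not os.path.isdir(checkpoint_dir):
        return []
    return sorted(
        name
        for name in os.listdir(checkpoint_dir)
        if os.path.isdir(os.path.join(checkpoint_dir, name))
    )
```

If the name passed as run_name is not in the returned list, the load and finetune calls are pointing at a run that was never copied.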
Thanks for your reply. But the same error happened regardless of the value of overwrite, and gpt2.generate() works fine with that checkpoint. I trained a few models, so I can't use the default checkpoint filename.
any updates on this issue? I am having the same one.
Call sess = gpt2.reset_session(sess=sess) before running finetune.
That line did not work for me in Colab. Sharing the code:
gpt2.mount_gdrive()
file_name = "trainset.txt"
gpt2.copy_file_from_gdrive(file_name)
gpt2.copy_checkpoint_from_gdrive(run_name='model-3.0')
sess = gpt2.start_tf_sess(threads=4)
gpt2.load_gpt2(sess, run_name='model-3.0')
sess = gpt2.reset_session(sess=sess)
gpt2.finetune(
    sess,
    dataset=file_name,
    steps=1000,
    print_every=10,
    multi_gpu=True,
    learning_rate=0.002,
    sample_every=200,
    save_every=500,
    overwrite=True
)
You need to download the GPT-2 model first via download_gpt2()
FileNotFoundError Traceback (most recent call last)
<ipython-input-9-77b3dd4c4586> in <module>()
8 sample_every=200,
9 save_every=500,
---> 10 overwrite=True
11 )
/usr/lib/python3.6/shutil.py in copyfile(src, dst, follow_symlinks)
118 os.symlink(os.readlink(src), dst)
119 else:
--> 120 with open(src, 'rb') as fsrc:
121 with open(dst, 'wb') as fdst:
122 copyfileobj(fsrc, fdst)
FileNotFoundError: [Errno 2] No such file or directory: 'models/124M/hparams.json'
You need to download the GPT-2 model first via download_gpt2()
I think the answer is right there for you..
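The traceback shows finetune trying to copy models/124M/hparams.json, so the base model folder has to exist even when resuming from a checkpoint. Here is a small sketch for checking this before calling finetune. missing_base_files is a hypothetical helper, and only hparams.json is confirmed by the traceback. The other two file names are an assumption about what the base model folder contains.

```python
import os

# hparams.json appears in the traceback above; encoder.json and vocab.bpe
# are assumed to be the other files expected in the base model folder.
BASE_MODEL_FILES = ("hparams.json", "encoder.json", "vocab.bpe")


def missing_base_files(model_dir="models", model_name="124M"):
    """Return the expected base-model files that are not present on disk."""
    base = os.path.join(model_dir, model_name)
    return [
        name
        for name in BASE_MODEL_FILES
        if not os.path.isfile(os.path.join(base, name))
    ]
```

A non-empty result suggests running gpt2.download_gpt2(model_name='124M') before finetune.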
Thanks, zacc. The question then is: shouldn't the model already be loaded after doing this?
gpt2.load_gpt2(sess, run_name='model-3.0')
My worry with calling:
def download_gpt2(model_dir='models', model_name='124M')
is that you need to give the model_name, and that is going to download the pretrained model from Google Cloud. I don't want to use the pretrained model, but the finetuned model that I saved to my Google Drive with the checkpoint. Maybe both actions are compatible and my fear is unfounded. Do you mean that I should first call:
gpt2.download_gpt2(model_name='124M')
and afterwards call
gpt2.load_gpt2(sess, run_name='model-3.0')
and that is fine?
Yes, you still need to call download_gpt2() even though you are continuing training from your saved model.
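Putting the thread together, one possible order of calls is sketched below. It is not a confirmed fix. It assumes that finetune restores the latest checkpoint for run_name on its own (restore_from='latest'), so load_gpt2 is not called first and the model variables are only created once. continue_finetuning is a hypothetical wrapper, and gpt2 is the imported gpt_2_simple module passed in as an argument.

```python
def continue_finetuning(gpt2, dataset, run_name, steps=500, model_name="124M"):
    """Resume fine-tuning an existing run (hypothetical wrapper).

    gpt2 is the imported gpt_2_simple module. The checkpoint for run_name
    is assumed to be in the local checkpoint folder already.
    """
    # The traceback shows finetune copying models/124M/hparams.json,
    # so the base model has to be downloaded even when resuming.
    gpt2.download_gpt2(model_name=model_name)

    sess = gpt2.start_tf_sess()

    # No load_gpt2() call here: building the model before finetune is what
    # the "Variable model/wpe already exists" error points at.
    # Assumption: restore_from="latest" picks up the saved checkpoint.
    gpt2.finetune(
        sess,
        dataset=dataset,
        model_name=model_name,
        run_name=run_name,
        restore_from="latest",
        steps=steps,
        overwrite=True,
    )
    return sess
```

In Colab this would be called after gpt2.mount_gdrive() and gpt2.copy_checkpoint_from_gdrive(run_name='model-3.0'), for example continue_finetuning(gpt2, 'trainset.txt', 'model-3.0'). Note that the finetune call shared earlier in the thread passes no run_name, so it would fall back to the default run name rather than 'model-3.0'.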