gpt-2

checksum does not match error for 1558M model

Open mkcreviews opened this issue 5 years ago • 1 comment

I downloaded the master branch and the 1558M model, but restoring the checkpoint fails with a checksum error.

Is there a way to bypass the checksum and keep working?

Command given: python3 src/interactive_conditional_samples.py --model_name=1558M

Error on Linux:

2020-05-23 23:57:34.717724: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Data loss: Checksum does not match: stored 1096252745 vs. calculated on the restored bytes 1479428755
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.DataLossError: Checksum does not match: stored 1096252745 vs. calculated on the restored bytes 1479428755
  [[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/generate_unconditional_samples.py", line 79, in <module>
    fire.Fire(sample_model)
  File "/usr/local/lib/python3.7/dist-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.7/dist-packages/fire/core.py", line 468, in _Fire
    target=component.__name__)
  File "/usr/local/lib/python3.7/dist-packages/fire/core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "src/generate_unconditional_samples.py", line 67, in sample_model
    saver.restore(sess, ckpt)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py", line 1276, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: Checksum does not match: stored 1096252745 vs. calculated on the restored bytes 1479428755
  [[node save/RestoreV2 (defined at src/generate_unconditional_samples.py:65) ]]

Caused by op 'save/RestoreV2', defined at:
  File "src/generate_unconditional_samples.py", line 79, in <module>
    fire.Fire(sample_model)
  File "/usr/local/lib/python3.7/dist-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.7/dist-packages/fire/core.py", line 468, in _Fire
    target=component.__name__)
  File "/usr/local/lib/python3.7/dist-packages/fire/core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "src/generate_unconditional_samples.py", line 65, in sample_model
    saver = tf.train.Saver()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
    restore_sequentially)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
    name=name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

DataLossError (see above for traceback): Checksum does not match: stored 1096252745 vs. calculated on the restored bytes 1479428755 [[node save/RestoreV2 (defined at src/generate_unconditional_samples.py:65) ]]

mkcreviews • May 23 '20 18:05

@mkcreviews Did you find a fix?

siddas27 • Aug 24 '20 14:08
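
Not a confirmed fix, but a diagnostic that may help anyone hitting the same error. The message comes from the restore op comparing a checksum stored in the checkpoint against the bytes it actually read, so a mismatch usually suggests the checkpoint data file on disk is truncated or corrupted (for example, an interrupted download). I am not aware of a supported switch to skip that check. The sketch below only inspects local files: it lists the size and SHA-256 of each model file so the values can be compared against a fresh download or a copy on another machine. The file list and the `models/1558M` location are assumptions based on how the repository's download script lays files out, and the function names are made up for illustration.

```python
import hashlib
import os

# Assumption: these are the files the download script fetches per model.
# Adjust the list if your copy of the repository differs.
EXPECTED_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte checkpoint is never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def summarize_model_dir(model_dir, filenames=EXPECTED_FILES):
    """Return {filename: (size_in_bytes, sha256_hex)}; the value is None if the file is missing."""
    summary = {}
    for name in filenames:
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):
            summary[name] = (os.path.getsize(path), file_sha256(path))
        else:
            summary[name] = None
    return summary


def print_summary(model_dir):
    """Print one line per expected file: name, size, and hash (or MISSING)."""
    for name, info in summarize_model_dir(model_dir).items():
        if info is None:
            print(f"{name}: MISSING")
        else:
            size, sha = info
            print(f"{name}: {size} bytes, sha256={sha}")
```

Calling `print_summary(os.path.join("models", "1558M"))` from the repository root prints the table. If `model.ckpt.data-00000-of-00001` is missing, or its size or hash differs from a second download, deleting the model directory and downloading the model again is the first thing to try.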