
InvalidArgumentError: Name: , Context feature 'video_id' is required but could not be found.

Open chenboheng opened this issue 6 years ago • 16 comments

I downloaded the 1/100 frame-level features and ran the train.py code. However, the following error is raised:

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Name: , Context feature 'video_id' is required but could not be found. [[Node: train_input/ParseSingleSequenceExample_2/ParseSingleSequenceExample = ParseSingleSequenceExample[Ncontext_dense=1, Ncontext_sparse=1, Nfeature_list_dense=1, Nfeature_list_sparse=0, Tcontext_dense=[DT_STRING], context_dense_shapes=[[]], context_sparse_types=[DT_INT64], feature_list_dense_shapes=[[]], feature_list_dense_types=[DT_STRING], feature_list_sparse_types=[], _device="/job:localhost/replica:0/task:0/cpu:0"](train_input/ReaderReadV2_2:1, train_input/ParseSingleSequenceExample_2/ParseSingleSequenceExample/feature_list_dense_missing_assumed_empty, train_input/ParseSingleSequenceExample_2/ParseSingleSequenceExample/context_sparse_keys_0, train_input/ParseSingleSequenceExample_2/ParseSingleSequenceExample/context_dense_keys_0, train_input/ParseSingleSequenceExample_2/ParseSingleSequenceExample/feature_list_dense_keys_0, train_input/ParseSingleSequenceExample_2/Const, train_input/ParseSingleSequenceExample_2/ParseSingleSequenceExample/debug_name)]] [[Node: train_input/shuffle_batch_join/cond_2/random_shuffle_queue_EnqueueMany/_98 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_90_train_input/shuffle_batch_join/cond_2/random_shuffle_queue_EnqueueMany", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

Caused by op u'train_input/ParseSingleSequenceExample_2/ParseSingleSequenceExample', defined at:
  File "train.py", line 638, in <module>
    app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 626, in main
    FLAGS.export_model_steps).run(start_new_model=FLAGS.start_new_model)
  File "train.py", line 353, in run
    saver = self.build_model(self.model, self.reader)
  File "train.py", line 524, in build_model
    num_epochs=FLAGS.num_epochs)
  File "train.py", line 236, in build_graph
    num_epochs=num_epochs))
  File "train.py", line 164, in get_input_data_tensors
    reader.prepare_reader(filename_queue) for _ in range(num_readers)
  File "/media/ResearchProject/deeplearning/code/Youtube-8M-WILLOW/readers.py", line 212, in prepare_reader
    max_quantized_value, min_quantized_value)
  File "/media/ResearchProject/deeplearning/code/Youtube-8M-WILLOW/readers.py", line 224, in prepare_serialized_examples
    for feature_name in self.feature_names
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/parsing_ops.py", line 780, in parse_single_sequence_example
    feature_list_dense_defaults, example_name, name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/parsing_ops.py", line 977, in _parse_single_sequence_example_raw
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_parsing_ops.py", line 287, in _parse_single_sequence_example
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Name: , Context feature 'video_id' is required but could not be found.

How can I solve this problem?

chenboheng avatar Jul 08 '18 03:07 chenboheng

It's because the devs changed their code for GDPR compliance, so there is a version mismatch between the code and your data.

estathop avatar Jul 09 '18 11:07 estathop

Hi, I met the same issue. I solved it by changing readers.py in WILLOW's directory back to the one in the starter code directory. Then training works. If you diff the two readers.py files, you can find some differences.

wenching33 avatar Aug 24 '18 08:08 wenching33

It's because the devs changed their code for GDPR compliance, so there is a version mismatch between the code and your data.

What can be done to resolve this issue?

punitagrawal32 avatar Sep 20 '18 11:09 punitagrawal32

Hi, I met the same issue. I solved it by changing readers.py in WILLOW's directory back to the one in the starter code directory. Then training works. If you diff the two readers.py files, you can find some differences.

I copy-pasted the readers.py code from the starter code into readers.py in WILLOW's directory, but it still does not work for me.

punitagrawal32 avatar Sep 20 '18 11:09 punitagrawal32

@punitagrawal32 You just need the dataset and starter code from the previous year, i.e. an older version. But even then, if you haven't downloaded the full frame-level and video-level features and put them in the folders the code expects, you will encounter another problem that has hit lots of people, including me.

estathop avatar Sep 20 '18 11:09 estathop

@estathop I have not been able to download the version 1 data using the download.py script (using curl).

This is the command I have been trying to use:

curl data.yt8m.org/download.py | shard=1,1000 partition=1/frame/train mirror=asia python

and this is the error I am encountering:

urllib.error.HTTPError: HTTP Error 403: Forbidden

I am perfectly able to download version 2, though (changing partition=2 in the command).

Do you know why this might be happening?

I have manually downloaded the older-version data and am trying to run the NetVLAD model on the 'audio' features only. I have been advised to modify frame_level_models.py so that it accepts only audio features, not both (audio and video). Could you suggest the changes?

punitagrawal32 avatar Sep 21 '18 04:09 punitagrawal32

@estathop Also, I realize I need to run the models on the newer data. Is there a way I can run @antoine77340's code (say, the GRU model) on the newer data without getting the error? Thanks in advance.

punitagrawal32 avatar Sep 21 '18 05:09 punitagrawal32

You need to modify his code heavily; personally, I quit trying to make it work.

estathop avatar Sep 21 '18 07:09 estathop

@punitagrawal32 About the original "Context feature 'video_id' is required but could not be found" problem, I solved it by changing "video_id" in readers.py to "id". As for changing the input data channels, you have to modify frame_level_models.py. In my case, I only want video input, so audio input is removed by commenting out lines 667, 668, and 670, and line 671 is added. This is an example for NetVLAD; for other models you can make a similar change.

664   with tf.variable_scope("video_VLAD"):
665     vlad_video = video_NetVLAD.forward(reshaped_input[:,0:1024])
666
667   # with tf.variable_scope("audio_VLAD"):
668   #   vlad_audio = audio_NetVLAD.forward(reshaped_input[:,1024:])
669
670   # vlad = tf.concat([vlad_video, vlad_audio],1)
671   vlad = tf.concat([vlad_video],1)

wenching33 avatar Sep 28 '18 02:09 wenching33

@wenching33 Thank you for the reply.

I need the audio features only, so I have commented out lines 664 and 665 and replaced vlad_video with vlad_audio in line 671. This generates the following error:

Traceback (most recent call last):
  File "train.py", line 670, in <module>
    app.run()
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "train.py", line 657, in main
    reader=reader)
  File "/home/mlai/Documents/Punit/Projects/google_audioset/dataset_audio_full/googleaudio_files/export_model.py", line 35, in __init__
    self.inputs, self.outputs = self.build_inputs_and_outputs()
  File "/home/mlai/Documents/Punit/Projects/google_audioset/dataset_audio_full/googleaudio_files/export_model.py", line 69, in build_inputs_and_outputs
    dtype=(tf.string, tf.int32, tf.float32)))
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/functional_ops.py", line 459, in map_fn
    maximum_iterations=n)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3232, in while_loop
    return_same_structure)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2952, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2887, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3201, in <lambda>
    body = lambda i, lv: (i + 1, orig_body(*lv))
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/functional_ops.py", line 448, in compute
    packed_fn_values = fn(packed_values)
  File "/home/mlai/Documents/Punit/Projects/google_audioset/dataset_audio_full/googleaudio_files/export_model.py", line 66, in <lambda>
    fn = lambda x: self.build_prediction_graph(x)
  File "/home/mlai/Documents/Punit/Projects/google_audioset/dataset_audio_full/googleaudio_files/export_model.py", line 100, in build_prediction_graph
    is_training=False)
  File "/home/mlai/Documents/Punit/Projects/google_audioset/dataset_audio_full/googleaudio_files/frame_level_models.py", line 668, in create_model
    vlad_audio = audio_NetVLAD.forward(reshaped_input[:,1024:])
  File "/home/mlai/Documents/Punit/Projects/google_audioset/dataset_audio_full/googleaudio_files/frame_level_models.py", line 199, in forward
    activation = tf.matmul(reshaped_input, cluster_weights)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 2018, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4456, in mat_mul
    name=name)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1731, in __init__
    control_input_ops)
  File "/home/mlai/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1579, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 0 and 128 for 'map/while/tower/audio_VLAD/MatMul' (op: 'MatMul') with input shapes: [300,0], [128,128].

Looking it up, it seems this worked for you because the video features occupy columns 0:1024 and the audio features occupy 1024: (to the end). Since I do not have video features, the slice [:,1024:] is empty (shape [300,0]), which throws the error.
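The shape mismatch described above can be reproduced without TensorFlow. This is a hypothetical sketch using numpy as a stand-in: with audio-only input the feature matrix has only 128 columns, so slicing [:, 1024:] yields width 0, and the subsequent matmul against the 128x128 cluster weights cannot line up.

```python
import numpy as np

# Audio-only input: 300 frames x 128 audio features (no 1024-dim video part).
max_frames, audio_dim = 300, 128
reshaped_input = np.zeros((max_frames, audio_dim))

# Slicing past the last column gives an empty matrix of shape (300, 0) --
# the same [300,0] that appears in the ValueError above.
audio_slice = reshaped_input[:, 1024:]
print(audio_slice.shape)  # (300, 0)

cluster_weights = np.zeros((128, 128))
# np.matmul(audio_slice, cluster_weights) would fail: inner dims 0 vs 128.

# With audio-only features, take the full width of the input instead:
audio_ok = reshaped_input[:, 0:audio_dim]
print(np.matmul(audio_ok, cluster_weights).shape)  # (300, 128)
```

The fix, under this reading, is to slice the audio features from column 0 rather than 1024 whenever the video half of the input is absent.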

I tried changing the code to:

vlad_audio = audio_NetVLAD.forward(reshaped_input[:,:])

and to:

vlad_audio = audio_NetVLAD.forward(reshaped_input[:, 0:])

but now it throws the error:

TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64

Do you know what might be causing the issue? Thanks for your help.

Regards, Punit.

punitagrawal32 avatar Sep 28 '18 11:09 punitagrawal32

@wenching33 awaiting your response...

Thanks in advance.

punitagrawal32 avatar Oct 01 '18 06:10 punitagrawal32

You need to modify his code heavily; personally, I quit trying to make it work.

@estathop did you get it to work? Thanks!

chendengshuai avatar Oct 25 '18 03:10 chendengshuai

@chenboheng If you just want a state-of-the-art classifier on Youtube8M, you can check this year's winner, whose code works with the version 2 data. None of last year's top-5 entries worked for me, so when I needed just a classifier I used Google's starter code to train a mixture of experts on the video-level features.

estathop avatar Oct 29 '18 11:10 estathop

@punitagrawal32 Sorry, I didn't check my messages for a long time. Your error message looks like a type error. I actually modified WILLOW's code heavily to make it work for my own purpose; there will be lots of problems you have to solve along the way. I'm afraid I can't help you with each problem you will meet.

wenching33 avatar Oct 30 '18 07:10 wenching33

You can change 'video_id' to 'id' in readers.py; it works. This is because in extract_tfrecords_main.py of youtube-8m, the flag is defined as

flags.DEFINE_string('video_file_feature_key', 'id',
                    'Input <video_file> will be written to context feature '
                    'with this key, as bytes list feature, with only one '
                    'entry, containing the file path of the video. This '
                    'can be used for debugging but not for training or eval.')

i.e. video_file_feature_key is set to 'id'.
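The key mismatch above can be illustrated without TensorFlow. This is a minimal pure-Python analogy (the parse_context helper and the record dict are hypothetical, not part of the repo): the WILLOW reader requests the context key "video_id", but v2 records carry "id", so the required-feature check fails.

```python
# Hypothetical stand-in for a required-context-feature check:
# the reader names a key, and parsing fails if the record lacks it.
def parse_context(record, required_key):
    if required_key not in record["context"]:
        raise ValueError(
            "Context feature '%s' is required but could not be found."
            % required_key)
    return record["context"][required_key]

# A v2-style record stores the video identifier under "id", not "video_id".
v2_record = {"context": {"id": "abc1", "labels": [3, 7]}}

try:
    parse_context(v2_record, "video_id")  # old key -> raises ValueError
except ValueError as e:
    print(e)

print(parse_context(v2_record, "id"))     # new key -> works
```

Changing the key the reader requests (rather than the data) is the same fix as editing "video_id" to "id" in readers.py.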

sunzuoxiao avatar Nov 15 '18 07:11 sunzuoxiao

@punitagrawal32

I tried changing the code to: vlad_audio = audio_NetVLAD.forward(reshaped_input[:,:]) and to: vlad_audio = audio_NetVLAD.forward(reshaped_input[:, 0:]) but now it throws the error: TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64 Do you know what might be causing the issue? Thankful for your help. Regards, Punit.

If you are using Python 3 instead of Python 2, you can try one of two things:

  1. use Python 2, not Python 3
  2. or keep using Python 3 after fixing the source code as below
--- a/frame_level_models.py
+++ b/frame_level_models.py
@@ -644,13 +644,13 @@ class NetVLADModelLF(models.BaseModel):
 
     if lightvlad:
       video_NetVLAD = LightVLAD(1024,max_frames,cluster_size, add_batch_norm, is_training)
-      audio_NetVLAD = LightVLAD(128,max_frames,cluster_size/2, add_batch_norm, is_training)
+      audio_NetVLAD = LightVLAD(128,max_frames,cluster_size//2, add_batch_norm, is_training)
     elif vlagd:
       video_NetVLAD = NetVLAGD(1024,max_frames,cluster_size, add_batch_norm, is_training)
-      audio_NetVLAD = NetVLAGD(128,max_frames,cluster_size/2, add_batch_norm, is_training)
+      audio_NetVLAD = NetVLAGD(128,max_frames,cluster_size//2, add_batch_norm, is_training)
     else:
       video_NetVLAD = NetVLAD(1024,max_frames,cluster_size, add_batch_norm, is_training)
-      audio_NetVLAD = NetVLAD(128,max_frames,cluster_size/2, add_batch_norm, is_training)
+      audio_NetVLAD = NetVLAD(128,max_frames,cluster_size//2, add_batch_norm, is_training)
 
   
     if add_batch_norm:# and not lightvlad:

I figured out this issue simply by testing as below:

Python 2.7.15 (default, Oct  2 2018, 11:47:18)
[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 256/2
128


Python 3.7.3 (default, Mar 27 2019, 09:23:39)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 256/2
128.0
>>> 256//2
128

This may also help: https://stackoverflow.com/questions/1535596/what-is-the-reason-for-having-in-python
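To connect this back to the TypeError in the thread: under Python 3, cluster_size / 2 is a float, and shape arguments reject floats. This sketch uses numpy as a stand-in for TensorFlow's shape handling:

```python
import numpy as np

cluster_size = 256

# Python 3 true division yields 128.0; a float is not a valid dimension.
try:
    np.zeros((300, cluster_size / 2))
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# Floor division yields the int 128, which is accepted.
arr = np.zeros((300, cluster_size // 2))
print(arr.shape)  # (300, 128)
```

This is why replacing / with // in the NetVLAD constructor calls, as in the diff above, resolves the error under Python 3 without changing behavior under Python 2.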

ssokjin avatar Jun 05 '19 05:06 ssokjin