
Colab problem with metadata

Open batrlatom opened this issue 5 years ago • 7 comments

Hello, I am having a problem running a training step in Colab, and the same problem occurs when I run your code locally with the Google Cloud SDK authenticated. I am not using a Hyperdash token.

Do you know what the problem could be?

2019-07-13 09:30:43.978054: E tensorflow/core/platform/cloud/curl_http_request.cc:596] The transmission of request 0x564e8ccb2900 (URI: http://metadata/computeMetadata/v1/instance/service-accounts/default/token) has been stuck at 0 of 0 bytes for 61 seconds and will be aborted. CURL timing information: lookup time: 0.006321 (No error), connect time: 0 (No error), pre-transfer time: 0 (No error), start-transfer time: 0 (No error)

2019-07-13 09:30:43.978267: I tensorflow/core/platform/cloud/retrying_utils.cc:77] The operation failed and will be automatically retried in 1.23757 seconds (attempt 1 out of 10), caused by: Unavailable: Error executing an HTTP request: libcurl code 42 meaning 'Operation was aborted by an application callback', error details: Callback aborted

batrlatom avatar Jul 13 '19 09:07 batrlatom

Any updates here?

samehraban avatar Aug 01 '19 11:08 samehraban

I just gave up and ran it on my local system. I omitted the whole Google Cloud SDK thing. But mildnet itself is nice and works quite well. Note that there are some bugs in the code. I think the right code for mildnet_mobilenet is:

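# Imports are assumed to come from tf.keras, matching the tf.keras.models.Model call below.
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import (Dense, Dropout, GlobalAveragePooling2D,
                                     Input, Lambda, concatenate)

# get_layers_output_by_name is the repository's own helper; it is used here
# as a mapping from layer name to that layer's output tensor.

def mildnet_mobilenet():
    # Backbone without the classification head (the variable keeps its old VGG name).
    vgg_model = MobileNet(weights=None, include_top=False, input_shape=(224, 224, 3))
    intermediate_layer_outputs = get_layers_output_by_name(vgg_model, ["conv_dw_1_relu", "conv_dw_2_relu", "conv_dw_4_relu", "conv_dw_6_relu", "conv_dw_12_relu"])
    # Pool the final feature map, then append the pooled intermediate activations.
    convnet_output = GlobalAveragePooling2D()(vgg_model.output)
    for layer_name, output in intermediate_layer_outputs.items():
        output = GlobalAveragePooling2D()(output)
        convnet_output = concatenate([convnet_output, output])
    convnet_output = Dense(1024, activation='relu')(convnet_output)
    convnet_output = Dropout(0.5)(convnet_output)
    convnet_output = Dense(1024, activation='relu')(convnet_output)
    # L2-normalize the embedding along the feature axis.
    convnet_output = Lambda(lambda x: K.l2_normalize(x, axis=1))(convnet_output)

    # These two inputs are declared but not connected to the output.
    first_input = Input(shape=(224, 224, 3))
    second_input = Input(shape=(224, 224, 3))

    final_model = tf.keras.models.Model(inputs=[first_input, second_input, vgg_model.input], outputs=convnet_output)

    return final_model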

batrlatom avatar Aug 02 '19 06:08 batrlatom

I'm working on the code, especially the datagen module, to make it compatible with the latest version of TensorFlow.

One thing that stood out to me was the three different inputs. Can you confirm that these three inputs are actually needed in the model? I couldn't find any code that uses them.

samehraban avatar Aug 03 '19 14:08 samehraban

Would you be able to share your latest TensorFlow-compatible version with us? By the way, I am using https://github.com/omoindrot/tensorflow-triplet-loss and it is a nice source of inspiration. As for the model, I also did not find any code that uses those inputs. Did you try training with triplet loss on cropped data? Do you have any experience with it?

batrlatom avatar Aug 05 '19 16:08 batrlatom

Well, you can see what I've done in my fork. It still needs quite a bit of work, I think, but training mildnet in Google Colab is now possible.

samehraban avatar Aug 06 '19 09:08 samehraban

I tried your code, and it looks nice. But I am still getting an error out of the blue. Could you please take a look?


/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/engine/training_utils.pyc in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    383                              ': expected ' + names[i] + ' to have shape ' +
    384                              str(shape) + ' but got array with shape ' +
--> 385                              str(data_shape))
    386   return data
    387 

ValueError: Error when checking input: expected input_1 to have shape (224, 224, 3) but got array with shape (224, 3, 3)

batrlatom avatar Aug 09 '19 09:08 batrlatom

It looks like the line:

img_width, img_height = [int(v) for v in model.input[0].shape[1:3]]

should be:

img_width, img_height = [int(v) for v in model.input[0].shape[:2]]
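A minimal sketch of why the slice matters, using plain tuples instead of the real model. It assumes that model.input[0] no longer carries a batch axis, so its shape is (224, 224, 3). That is a guess consistent with the (224, 3, 3) in the error above, not something confirmed in this thread. spatial_dims is a hypothetical helper, not part of the repository.

```python
def spatial_dims(shape, has_batch_axis):
    """Return the two spatial dimensions of an image input shape.

    `shape` is a plain tuple such as (None, 224, 224, 3) or (224, 224, 3).
    `has_batch_axis` says whether the first entry is the batch axis.
    """
    start = 1 if has_batch_axis else 0
    return tuple(int(v) for v in shape[start:start + 2])


batched = (None, 224, 224, 3)  # shape with a leading batch axis
unbatched = (224, 224, 3)      # assumed shape once the batch axis is gone

# Slicing [1:3] is only right when the batch axis is present.
print(batched[1:3])    # (224, 224)
print(unbatched[1:3])  # (224, 3)
print(unbatched[:2])   # (224, 224)

print(spatial_dims(batched, has_batch_axis=True))
print(spatial_dims(unbatched, has_batch_axis=False))
```

If the shape does include the batch axis, [:2] would pick up the batch entry instead, so it may be worth printing model.input[0].shape once before choosing the slice.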

StevenShadman avatar Aug 13 '19 16:08 StevenShadman