YAD2K How can you train this model on your own data?

Can you give a mini-tutorial on how to train this model and use it to predict bounding boxes on your own data (given 20,000 images with bounding boxes, of objects not in ImageNet/COCO)? Just a few lines to get started. Do you have to first use the original Darknet code to train the model?

May 17 '17 14:05 drorhilman

See train_overfit.py for an example of how to train on a single image. Extending this to a larger dataset isn't difficult and basically just requires zero-padding the boxes tensor to some maximum number of boxes to allow for mini-batch training. E.g., if you set max boxes to 10 and one example has 4 boxes, then its boxes tensor is (4,5) and you want to pad it with zeros on the bottom so it's (10,5). That way all boxes tensors in a batch are (10,5).
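A minimal sketch of that padding step, assuming each image's boxes are stored as a NumPy array with one row of five values per box. The helper name pad_boxes and the choice to truncate images that have more than max_boxes boxes are assumptions for illustration, not part of YAD2K:

```python
import numpy as np


def pad_boxes(boxes, max_boxes=10):
    """Zero-pad (or truncate) an (n, 5) boxes array to shape (max_boxes, 5)."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 5)
    padded = np.zeros((max_boxes, 5), dtype=np.float32)
    n = min(len(boxes), max_boxes)  # assumption: extra boxes are dropped
    padded[:n] = boxes[:n]
    return padded


# Two images with different numbers of boxes (random placeholder values).
boxes_a = np.random.rand(4, 5)
boxes_b = np.random.rand(7, 5)

# After padding, the per-image arrays can be stacked into one batch.
batch = np.stack([pad_boxes(boxes_a), pad_boxes(boxes_b)])
print(batch.shape)  # (2, 10, 5)
```

The all-zero rows act as "no box" entries, so every example in a mini-batch has the same shape.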

May 17 '17 16:05 allanzelener

I did this and tried to train on my own dataset. I made no changes to the number of anchors. However, at the end of overfitting on one image, it reports 0 boxes found. The total loss computed by the yolo_loss function seems to decrease to 0.07, but the loss Keras prints to standard output does not go below 6.5. Any idea where I might have gone wrong?

`def _main():
class_names = ['T_Light', 'T_Sign']
anchors = YOLO_ANCHORS
# Load Images

images_dir = '/yolo'
images_f = open('/ValidImages_Aachen.txt','r')
images = []
for line in images_f:
    line = (line.rstrip('\n').split(','))[0]
    img = cv2.imread(images_dir+line)
    orig_size = np.array([img.shape[1], img.shape[0]])
    img = cv2.resize(img, (416, 416), interpolation=cv2.INTER_CUBIC)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    image_data = np.array(img, dtype=np.float)
    image_data /= 255.
    images.append(image_data)
images_f.close()
orig_size = np.expand_dims(orig_size, axis=0)
print orig_size
images = np.asarray(images)
# Load Labels

labels_path = '/home/tito/Desktop/yolo/Coords/Cityscapes_BB_Aachen.pickle'
labels_f = open(labels_path,'rb')
boxes = []
detectors_mask_list = []
matching_true_boxes_list = []
for i in range(len(images)):
    box = np.asarray(pickle.load(labels_f))
    # Get box parameters as x_center, y_center, box_width, box_height, class.
    box_xy = 0.5 * (box[:, 3:5] + box[:, 1:3])
    box_wh = box[:, 3:5] - box[:, 1:3]
    box_xy = box_xy / ((orig_size)*1.0)
    box_wh = box_wh / ((orig_size)*1.0)
    box = np.concatenate((box_xy, box_wh, box[:, 0:1]), axis=1)
    
    if len(box) < 10:
        box = np.append(box, np.zeros((10-len(box), 5)), axis=0)
    
    boxes.append(box)
    

    # Precompute detectors_mask and matching_true_boxes for training.
    # Detectors mask is 1 for each spatial position in the final conv layer and
    # anchor that should be active for the given boxes and 0 otherwise.
    # Matching true boxes gives the regression targets for the ground truth box
    # that caused a detector to be active or 0 otherwise.
    detectors_mask_shape = (13, 13, 5, 1)
    matching_boxes_shape = (13, 13, 5, 5)
    detectors_mask, matching_true_boxes = preprocess_true_boxes(box, anchors, [416, 416])
    detectors_mask_list.append(detectors_mask)
    matching_true_boxes_list.append(matching_true_boxes)

# Create model input layers.
image_input = Input(shape=(416, 416, 3))
boxes_input = Input(shape=(None, 5))
detectors_mask_input = Input(shape=detectors_mask_shape)
matching_boxes_input = Input(shape=matching_boxes_shape)

boxes = np.asarray(boxes)
detectors_mask_list = np.asarray(detectors_mask_list)
matching_true_boxes_list = np.asarray(matching_true_boxes_list)

images = images[0:1]
boxes = boxes[0:1]
detectors_mask_list = detectors_mask_list[0:1]
matching_true_boxes_list = matching_true_boxes_list[0:1]

print "***************************************************************"
plt.imshow(images[0]); plt.show() 
print boxes[0]
print detectors_mask_list[0]
print matching_true_boxes_list[0]
print "***************************************************************"

print(images.shape)
print(boxes.shape)
print(detectors_mask_list.shape)
print(matching_true_boxes_list.shape)
#print(boxes_extents)
#prisnt(np.where(detectors_mask == 1)[:-1])
#print(matching_true_boxes[np.where(detectors_mask == 1)[:-1]])

# Create model body.
model_body = yolo_body(image_input, len(anchors), len(class_names))
model_body = Model(image_input, model_body.output)
# Place model loss on CPU to reduce GPU memory usage.
#model_body.load_weights('/home/tito/Traffic/YOLO/yad2k/model_data/yolo.h5')
#model_body.summary()


with tf.device('/cpu:0'):
    # TODO: Replace Lambda with custom Keras layer for loss.
    model_loss = Lambda(
        yolo_loss,
        output_shape=(1, ),
        name='yolo_loss',
        arguments={'anchors': anchors,
                   'num_classes': len(class_names)})([
                       model_body.output, boxes_input,
                       detectors_mask_input, matching_boxes_input
                   ])
model = Model(
    [image_input, boxes_input, detectors_mask_input,
     matching_boxes_input], model_loss)
model.compile(
    optimizer='adam', loss={
        'yolo_loss': lambda y_true, y_pred: y_pred
    })  # This is a hack to use the custom loss function in the last layer.

# Add batch dimension for training.
#image_data = np.expand_dims(image_data, axis=0)
#boxes = np.expand_dims(boxes, axis=0)
#detectors_mask = np.expand_dims(detectors_mask, axis=0)
#matching_true_boxes = np.expand_dims(matching_true_boxes, axis=0)
'''
num_steps = 200
# TODO: For full training, put preprocessing inside training loop.
# for i in range(num_steps):
#     loss = model.train_on_batch(
#         [image_data, boxes, detectors_mask, matching_true_boxes],
#         np.zeros(len(image_data)))
model.fit([images, boxes, detectors_mask_list, matching_true_boxes_list],
          np.zeros(len(images)),
          batch_size=1,
          epochs=num_steps)
model.save_weights('overfit_weights.h5')
'''
model.load_weights('overfit_weights.h5')
# Create output variables for prediction.
yolo_outputs = yolo_head(model_body.output, anchors, len(class_names))
input_image_shape = K.placeholder(shape=(2, ))
boxes, scores, classes = yolo_eval(yolo_outputs, input_image_shape, score_threshold=.3, iou_threshold=.9)

# Run prediction on overfit image.
sess = K.get_session()  # TODO: Remove dependence on Tensorflow session.
t = time.time()
out_boxes, out_scores, out_classes = sess.run(
    [boxes, scores, classes],
    feed_dict={
        model_body.input: images,
        input_image_shape: [416, 416],
        K.learning_phase(): 0
    })
print (time.time() - t)*1000
print('Found {} boxes for image.'.format(len(out_boxes)))
print(out_boxes)

# Plot image with predicted boxes.
image_with_boxes = draw_boxes(images[0], out_boxes, out_classes,
                              class_names, out_scores)
plt.imshow(image_with_boxes, interpolation='nearest')
plt.show()`

May 22 '17 11:05 titoghose

I found that the model needs to be trained for about 1000 steps to consistently fit to my single training image.

Haven't tried it but setting the score threshold to 0 will let you see if any low confidence boxes are being predicted.

May 22 '17 14:05 allanzelener

I have tried to train it for 1000 epochs but the loss won't fall below 5. Thanks for your help. I'll try to figure out what the problem may be.

May 22 '17 19:05 titoghose

The loss probably won't fall much below 5 because the loss Keras reports includes the L2 regularization on the model's weights. But I have seen it get closer to about 2-3 on some minibatches. It should still be able to fit on a single image with a loss that high. I've added an option to yolo_loss to print each component of the yolo loss function. (It was doing this before by default but maybe it wasn't clear what it was.)
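To illustrate why the two numbers can differ so much, here is a rough sketch of a reported loss made up of a data term plus an L2 penalty on the weights. The weight-decay value, the tensor shapes and the random weights are all made-up assumptions; only the 0.07 data loss comes from the post above:

```python
import numpy as np


def reported_loss(data_loss, weights, weight_decay):
    """Return (total, penalty) where total = data_loss + L2 penalty."""
    penalty = weight_decay * sum(float(np.sum(w ** 2)) for w in weights)
    return data_loss + penalty, penalty


rng = np.random.RandomState(0)
# Hypothetical convolution kernels standing in for the model's weights.
weights = [rng.normal(size=(3, 3, 64, 128)),
           rng.normal(size=(3, 3, 128, 256))]

total, penalty = reported_loss(0.07, weights, weight_decay=5e-4)
print('data loss: 0.07, L2 penalty: {:.2f}, total: {:.2f}'.format(
    penalty, total))
```

Even when the data term is tiny, the penalty keeps the total well above zero, so the total alone says little about how well the boxes are fit.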

Also I just reverted a change where I set the target for the confidence of each detector to be the IOU of the predicted box with the closest ground truth box. This is what's done in YOLO when the rescore=True option is set, but I've found in my experiments this probably makes it take longer to converge to higher confidence since initial predictions will be bad. If you want this behavior then set rescore_confidence to True in the yolo_loss function.
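For reference, a small sketch of the IOU quantity that the rescoring option uses as the confidence target. The (x_min, y_min, x_max, y_max) layout and the example boxes are assumptions for illustration; yolo_loss computes this on tensors rather than with a helper like this:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


truth = (0.2, 0.2, 0.6, 0.6)
early_prediction = (0.5, 0.5, 0.9, 0.9)   # poor overlap, early in training
late_prediction = (0.22, 0.2, 0.6, 0.58)  # close to the ground truth

print(iou(truth, early_prediction))
print(iou(truth, late_prediction))
```

With rescoring, a badly placed early prediction gets a confidence target near 0 instead of 1, which is consistent with the slower rise in confidence described above.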

May 23 '17 19:05 allanzelener

Thank you, I'll try retraining it on my dataset.

May 24 '17 04:05 titoghose

A script for (re)training a full model was just contributed. I haven't tested it myself but see retrain_yolo.py for details.

May 24 '17 19:05 allanzelener

The script uses a numpy file to load training data.

You can see how I packaged my dataset at https://github.com/shadySource/DATA.

May 24 '17 19:05 alecGraves

@titoghose Did you successfully train a model with your own dataset?

@allanzelener I tried to run train_overfit.py, and it produced overfit_weights.h5. However, I could not use this h5 file with ./test_yolo.py.

This is the error message from typing python3 test_yolo.py overfit_weights.h5

Traceback (most recent call last):
  File "test_yolo.py", line 199, in <module>
    _main(parser.parse_args())
  File "test_yolo.py", line 83, in _main
    yolo_model = load_model(model_path)
  File "/usr/local/lib/python3.5/dist-packages/keras/models.py", line 238, in load_model
    raise ValueError('No model found in config file.')
ValueError: No model found in config file.

It seems like the h5 file produced by train_overfit.py is missing some information.

May 25 '17 10:05 HTLife

maybe it is a version issue (keras version or h5py version) ?

May 25 '17 12:05 drorhilman

@drorhilman Did you run test_yolo.py with your own h5 file successfully?
I'd like to know which version you use (both Keras and h5py).

By the way, I have no problem running ./test_yolo.py model_data/yolo.h5, the command provided on the Readme page.

May 25 '17 12:05 HTLife

I used Python 3.5 with the latest Keras + TensorFlow version (fresh install) and it seemed to work. Currently I am working on a different machine, with Python 2.7, and I will go back to try this soon. I will report if I have issues. Anyway, I remember having issues with old Keras versions and h5, and also with moving from h5py on Python 2 to h5py on Python 3.

May 25 '17 12:05 drorhilman

Still no luck! I retried with the following settings

python 3.5.2
keras 2.0.4
tensorflow 1.1.0
h5py 2.7.0
numpy 1.12.1

and got the same error message.

May 25 '17 13:05 HTLife

overfit_weights.h5 are just the weights of the model, not the model itself. I did it that way because there seems to be an issue with serializing the model with the loss function attached.

You need to first recreate the model in your script and then call model.load_weights('overfit_weights.h5').

May 25 '17 14:05 allanzelener

@shadySource I was actually following your link (https://github.com/shadySource/DATA). I ran python package_dataset.py without making any modification, and it gives me the following error:


dhcp-6-143:DATA USER$ python package_dataset.py 
Traceback (most recent call last):
  File "package_dataset.py", line 55, in <module>
    img = np.array(PIL.Image.open(os.path.join('data', label[0][0], label[0][1])))
  File "/Users/USER/anaconda/lib/python2.7/site-packages/PIL/Image.py", line 2312, in open
    fp = builtins.open(filename, "rb")
TypeError: file() argument 1 must be encoded string without null bytes, not str

May 28 '17 07:05 roshanDD

@roshanDD I wrote the code in python3. I did not test it for python2.7. It could also be an issue with the version of PIL you are using.

May 30 '17 20:05 alecGraves

May I ask about the process of preparing the label data file? In the original C code, you need to prepare labels in the format "class id xmin xmax ymin ymax" and put the label txt file in the same folder as the image file, with the same name as the image file.

Does your code (the Keras implementation) work a different way?

Thanks

Jun 01 '17 06:06 chairath

@chairath The yolo_loss function and the preprocess_true_boxes function expect the ground truth boxes to be in the form "x_center, y_center, width, height, class" where these values have also been normalized by the image size. The Darknet C code also does this preprocessing.
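A short sketch of that conversion, assuming the labels start out as pixel corner coordinates in the order class, x_min, y_min, x_max, y_max (the order used by the packaging scripts in this thread). The function name corners_to_yolo is made up for illustration:

```python
import numpy as np


def corners_to_yolo(boxes, image_size):
    """Convert corner boxes in pixels to normalized center boxes.

    boxes: rows of (class, x_min, y_min, x_max, y_max) in pixels.
    image_size: (width, height) of the image the labels refer to.
    Returns rows of (x_center, y_center, width, height, class) in [0, 1].
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 5)
    size = np.asarray(image_size, dtype=np.float64)
    xy = 0.5 * (boxes[:, 1:3] + boxes[:, 3:5]) / size
    wh = (boxes[:, 3:5] - boxes[:, 1:3]) / size
    return np.concatenate([xy, wh, boxes[:, 0:1]], axis=1)


# One box of class 0 covering the middle of a 640x480 image.
example = corners_to_yolo([[0, 160, 120, 480, 360]], image_size=(640, 480))
print(example.tolist())  # [[0.5, 0.5, 0.5, 0.5, 0.0]]
```

Note that image_size must be the size the labels were drawn at; dividing by a different size gives coordinates outside [0, 1].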

Jun 01 '17 12:06 allanzelener

@allanzelener thanks for your quick response, sorry for misunderstanding, I will try again

Jun 02 '17 00:06 chairath

Hello! I was successful in retraining using retrain_yolo.py. However, I would like to retrain the tiny-yolo-voc model. Can you point me in the right direction? I've tried replacing the following 2 lines from def create_model(...) from retrain_yolo.py:

yolo_model = yolo_body(image_input, len(anchors), len(class_names))
topless_yolo = Model(yolo_model.input, yolo_model.layers[-2].output)

with:

yolo_model = load_model(os.path.join('model_data', 'tiny-yolo-voc.h5'))
topless_yolo = Model(yolo_model.input, yolo_model.layers[-2].output)

and I keep getting Graph disconnected errors. Based on this error I tried:

yolo_model = load_model(os.path.join('model_data', 'tiny-yolo-voc.h5'))
topless_yolo = Sequential()
for layer in yolo_model.layers[-1]:
    topless_yolo.add(layer)

but this says that LeakyReLU is not iterable.

I've also tried creating my own yolo_body() function based on the code from main() function of yad2k.py that takes the .cfg and .weights files and creates a model, but I still get Graph disconnected when I try to topless_yolo = Model(yolo_model.input, yolo_model.layers[-2].output) from it.

I'm not sure how to rewrite the yolo_body function to the simpler model of tiny-yolo-voc since I'm very new to this. Could you please help me?

Edit. I managed to solve this by writing up a new function, yolo_tiny(), to take the place of yolo_body() in the retrain_yolo.py script:

def yolo_tiny(inputs, num_anchors, num_classes):
    darknet = Model(inputs, compose(
        DarknetConv2D_BN_Leaky(16, (3, 3)),
        MaxPooling2D(),
        DarknetConv2D_BN_Leaky(32, (3, 3)),
        MaxPooling2D(),
        DarknetConv2D_BN_Leaky(64, (3, 3)),
        MaxPooling2D(),
        DarknetConv2D_BN_Leaky(128, (3, 3)),
        MaxPooling2D(),
        DarknetConv2D_BN_Leaky(256, (3, 3)),
        MaxPooling2D(),
        DarknetConv2D_BN_Leaky(512, (3, 3)),
        MaxPooling2D(padding='same',
                     pool_size=(2, 2),
                     strides=(1, 1)),
        DarknetConv2D_BN_Leaky(1024, (3, 3)),
        DarknetConv2D_BN_Leaky(1024, (3, 3)))(inputs))
    x = DarknetConv2D(num_anchors * (num_classes + 5), (1, 1))(darknet.output)
    return Model(inputs, x)

I figured it out by looking at the summary of the model returned by yolo_body(), comparing it to yolo.cfg and then using tiny-yolo-voc.cfg to create yolo_tiny(). Hope this will be useful for someone.

Jun 03 '17 23:06 altvali

@altvali @allanzelener Have you tried to retrain on Pascal and gotten results similar to the benchmark? I am doing this and it seems that the val_loss does not converge.

Jun 07 '17 02:06 Tangzy7

Hi !

It's my first time with deep learning and Keras. I'm trying to use YAD2K for my own project. I have already changed the code to use a webcam and video, but now I'm trying to train YAD2K with my own data (images with annotations in the Pascal VOC format). I had some problems with retrain_yolo.py. I believe I have converted my data to the correct format in a .npz with this script:

import os
import sys
import json
import glob
import PIL.Image
import numpy as np

from lxml import etree

debug = False #only load 10 images
shuffle = False # shuffle dataset

text = []
image_labels = []
path = '/home/work_application/FireDiamond/Annotation/'
for i,filename in enumerate(glob.glob(path+'*.xml')):   
	image_labels.append([])  
	image_labels[i].append(['FireDiamond',os.path.basename(filename).split('.')[0] + ".jpg"])
	tree = etree.parse(filename)
	root = tree.getroot()

	for j,object in enumerate(root.findall('object')) :

		boxConfig = []
		# Classe of the object
		boxConfig.append(0)

		box = object.find('bndbox')
		boxConfig.append(float(box.find('xmin').text))
		boxConfig.append(float(box.find('ymin').text))
		boxConfig.append(float(box.find('xmax').text))
		boxConfig.append(float(box.find('ymax').text))

		image_labels[i].append(boxConfig)


for i in image_labels :
	print(i)

# load images
images = []
for i, label in enumerate(image_labels):
	img = np.array(PIL.Image.open(os.path.join(label[0][0], label[0][1])).resize((640, 480)), dtype=np.uint8)
	print(img.shape)
	images.append(img)
	if debug and i == 9:
		break

#convert to numpy for saving
print(len(images))
images = np.array(images, dtype=np.uint8)
image_labels = [np.array(i[1:]) for i in image_labels]# remove the file names
image_labels = np.array(image_labels)

#shuffle dataset
if shuffle:
    np.random.seed(13)
    indices = np.arange(len(images))
    np.random.shuffle(indices)
    images, image_labels = images[indices], image_labels[indices]

#save dataset
np.savez("my_dataset", images=images, boxes=image_labels)
print('Data saved: my_dataset.npz')

But when I try to use retrain_yolo.py I get this message:

Traceback (most recent call last):
  File "retrain_yolo.py", line 345, in <module>
    _main(args)
  File "retrain_yolo.py", line 64, in _main
    detectors_mask, matching_true_boxes = get_detector_mask(boxes, anchors)
  File "retrain_yolo.py", line 158, in get_detector_mask
    detectors_mask[i], matching_true_boxes[i] = preprocess_true_boxes(box, anchors, [416, 416])
  File "/home/versalg/work_application/YAD2K/yad2k/models/keras_yolo.py", line 420, in preprocess_true_boxes
    detectors_mask[i, j, best_anchor] = 1
IndexError: index 15 is out of bounds for axis 0 with size 13

I think I have a problem with the conversion of my data or something like that, but I don't understand what it is.

I also have some questions about YOLO and YAD2K: I don't really understand what the anchor boxes are. Are they the different possible shapes for a box in a cell? What are the variables detectors_mask_shape = (13, 13, 5, 1) and matching_boxes_shape = (13, 13, 5, 5)?

Excuse me if my questions are a bit naive, but this is the first time I have used neural networks, and because I'm not a native English speaker I may have misunderstood some information when I read the YOLO paper and your posts.

Thank you for your attention !

Jul 17 '17 13:07 Ahziel

@Ahziel Yeah, I think the issue is that the box labels you are using were made at a resolution other than 640x480. If all of the images have the same shape, you can just change the following line.

img = np.array(PIL.Image.open(os.path.join(label[0][0], label[0][1]))~~.resize((640, 480))~~, dtype=np.uint8)

If not, you can modify the data packaging script to resize the box labels. I only added the resize because all of my labels were made at that resolution regardless of original image resolution.
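A sketch of why mismatched resolutions can produce an out-of-bounds grid index, and of rescaling labels to match resized images. It assumes a box center is mapped to a cell of the 13x13 output grid (13 = 416 / 32) by normalizing and flooring; the helper names and example numbers are made up for illustration:

```python
import math

GRID = 13  # matches detectors_mask_shape = (13, 13, 5, 1)


def grid_cell(x_center, y_center, image_size, grid=GRID):
    """Return the (row, col) grid cell for a box center given in pixels."""
    width, height = image_size
    col = math.floor(x_center / width * grid)
    row = math.floor(y_center / height * grid)
    return row, col


def rescale_box(box, label_size, image_size):
    """Scale (x_min, y_min, x_max, y_max) from label to image resolution."""
    sx = image_size[0] / label_size[0]
    sy = image_size[1] / label_size[1]
    x_min, y_min, x_max, y_max = box
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


# A label drawn on a 1280x960 original, used with a 640x480 resized image.
box = (700, 650, 820, 750)
print(grid_cell(760, 700, (640, 480)))   # (18, 15): outside the 13x13 grid

# Rescaling the label to the resized image puts it back inside the grid.
x_min, y_min, x_max, y_max = rescale_box(box, (1280, 960), (640, 480))
center = ((x_min + x_max) / 2, (y_min + y_max) / 2)
print(grid_cell(center[0], center[1], (640, 480)))  # (9, 7)
```

Either keep images at the resolution the labels were made at, or rescale the labels by the same factor as the images.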

Jul 17 '17 16:07 alecGraves

@shadySource Thanks, that was it. Now I can launch the training!

Jul 18 '17 08:07 Ahziel

Hi,

I'm facing another issue with my own dataset. Before, I worked only with small datasets (100-200 pictures), but now I have to use a dataset of 2000 pictures, and when I run the script it crashes with only the word Killed. So I searched on the internet and tried some modifications, but it still crashes; only the message changed, and now I get MemoryError.

Is it possible to use a large dataset with this script? And if not, what can I change to solve this problem?

Aug 04 '17 14:08 Ahziel

The issue is that you are loading all of those images into ram. I remember I had an issue loading 1000 images on a 16gb ram computer, and I was able to fix it by modifying the data preprocessing function to not keep the original images. I made this fix in retrain_yolo.py on the Theano Support branch with an open PR. There could still be too many images in the dataset, and fixing that will require modification of the retraining script. You could change how data is loaded, or you could split your dataset and train on 1/2 at a time.
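A back-of-the-envelope estimate of the memory involved, as a sketch. It assumes the raw images are 640x480 RGB stored as uint8 and that the preprocessed copies are 416x416 RGB stored as 8-byte floats; the actual sizes and dtypes in a given run may differ:

```python
def dataset_bytes(num_images, width, height, channels=3, bytes_per_value=1):
    """Bytes needed to hold a dense image array fully in memory."""
    return num_images * width * height * channels * bytes_per_value


# 2000 raw images, 640x480, uint8 (1 byte per value).
raw = dataset_bytes(2000, 640, 480)

# The same images resized to 416x416 and converted to float64 (8 bytes).
processed = dataset_bytes(2000, 416, 416, bytes_per_value=8)

print('raw:       {:.2f} GB'.format(raw / 1e9))        # raw:       1.84 GB
print('processed: {:.2f} GB'.format(processed / 1e9))  # processed: 8.31 GB
```

Holding both copies at once needs roughly 10 GB under these assumptions, before counting the model, which is why dropping the originals or loading per batch helps.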

For the future of this project, how the data is loaded should change to allow training on large datasets.

We should use the keras fit_generator() to grab images one batch at a time from the hard drive. When the images are loaded, we can also add noise to the images, and with some math we can add flips and rotations to the data (but this is just extra).
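A rough sketch of such a generator, assuming a list of image paths and a matching array of zero-padded boxes. The load_image argument is a hypothetical loader that you supply (for example one that opens, resizes and scales an image); here a stand-in that returns blank images is used so the sketch runs without any files:

```python
import numpy as np


def batch_generator(image_paths, boxes, load_image, batch_size=8, seed=0):
    """Yield (images, boxes) batches forever, loading images on demand."""
    rng = np.random.RandomState(seed)
    indices = np.arange(len(image_paths))
    while True:  # generators used for training are expected to loop forever
        rng.shuffle(indices)  # new order each pass over the data
        for start in range(0, len(indices), batch_size):
            chosen = indices[start:start + batch_size]
            images = np.stack([load_image(image_paths[i]) for i in chosen])
            yield images, np.stack([boxes[i] for i in chosen])


def fake_load_image(path):
    # Stand-in for reading `path` from disk and preprocessing it.
    return np.zeros((416, 416, 3), dtype=np.float32)


paths = ['img_{}.jpg'.format(i) for i in range(20)]  # hypothetical names
padded_boxes = np.zeros((20, 10, 5), dtype=np.float32)

gen = batch_generator(paths, padded_boxes, fake_load_image, batch_size=8)
images, batch_boxes = next(gen)
print(images.shape, batch_boxes.shape)  # (8, 416, 416, 3) (8, 10, 5)
```

For the training model shown earlier in this thread, each batch would also need detectors_mask and matching_true_boxes computed from the batch's boxes, plus a dummy zeros target, mirroring the model.fit call above. Only one batch of images is in memory at a time, and noise or flips could be applied inside load_image.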

Also, with a large enough dataset, the model could be trained from random initialization (without using the pre-trained model at all).

Aug 04 '17 16:08 alecGraves

Hi! Thank you @shadySource, I changed to fit_generator and now it works perfectly!

I have some more theoretical questions about training. I think I know the answers, but I prefer to ask to spare myself long hours of unnecessary training.

Assume that I already have a network trained to identify 10 objects, and that I want to add an eleventh. Do I have to start training the new network again from the beginning, or is there a way to keep what has already been learned?

And another question: if I have to teach the network an additional viewpoint for one of my objects, can I just pass these new views through the network, or is that likely to change how the network responds to the other objects?

I think those will be my last questions (well, I hope). Thank you again for the help you have given me, and if anyone needs help with fit_generator I can try to help!

Aug 08 '17 07:08 Ahziel

You will probably need to remove the last layer since you are outputting one more class. This means that you will need to train your network.
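To make the size change concrete, a small sketch using the same filter-count expression as the final DarknetConv2D in the yolo_tiny() function posted earlier in this thread, num_anchors * (num_classes + 5). The choice of 5 anchors follows the (13, 13, 5, 1) mask shape used above; the function name is made up:

```python
def output_filters(num_anchors, num_classes):
    """Number of filters in the final 1x1 convolution."""
    # Per anchor: 4 box coordinates + 1 confidence + one score per class.
    return num_anchors * (num_classes + 5)


print(output_filters(5, 10))  # 75
print(output_filters(5, 11))  # 80
```

Only this final layer changes shape, so the weights of the earlier layers can be loaded as they are and just the new last layer starts from scratch.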

Aug 08 '17 08:08 roshanDD

Hi all! I used the retrain_yolo script. The retraining process completed successfully, but when I tried to load trained_stage_3_best.h5 using the load_model method it throws: raise ValueError('No model found in config file.') ValueError: No model found in config file.

Aug 16 '17 08:08 Firyuza

Hello @Firyuza . I'm not sure but I think it's because trained_stage_3_best.h5 is not a model but just the weights of the model. So if you want to use it you need to use the function load_weights instead.

Aug 16 '17 14:08 Ahziel
