pytorch-nested-unet

Getting "ValueError: With n_samples=0... the resulting train set will be empty." even after seemingly appending datasets properly.

Open • JacobOfCorns opened this issue 5 years ago • 1 comment

I'm training a CNN and it looks like the program reads my dataset properly.

i = 0 
for f, breed in tqdm(df_train.values):
    if type(cv2.imread('train_{}.jpeg'.format(f)))==type(None):
        continue
    else:
        img = cv2.imread('train_{}.jpeg'.format(f))
        label = one_hot_labels[i]
        x_train.append(cv2.resize(img, (im_size1, im_size2)))
        y_train.append(label)
        i += 1
np.save('x_train2',x_train)
np.save('y_train2',y_train)
print('Done')

I then get this output, which suggests that 35,126 images were appended:

100%|█████████████████████████████████████████████████████████████████████████| 35126/35126 [00:01<00:00, 25629.80it/s]
Done

I then convert x_train and y_train to NumPy arrays:

y_train_raw = np.array(y_train, np.uint8)
x_train_raw = np.array(x_train, np.float32) / 255.

But when I print the shapes, both arrays turn out to be empty:

print(x_train_raw.shape)
print(y_train_raw.shape)

(0,)
(0,)

Finally, when I try to split the dataset by calling scikit-learn's train_test_split:

X_train, X_valid, Y_train, Y_valid = train_test_split(x_train_raw, y_train_raw, test_size=0.1, random_state=1)

I get the error:

ValueError: With n_samples=0, test_size=0.1 and train_size=0.9, the resulting train set will be empty. Adjust any of the aforementioned parameters.

Is it because I didn't append the dataset properly? Or did I not convert the images into tensors? If so, what would be the proper way to convert them into tensors?

JacobOfCorns, Jul 06 '20 16:07
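
One thing that stands out in the output above: 35,126 iterations finishing in about one second (25,629 it/s) is far quicker than decoding that many JPEGs would normally take. cv2.imread returns None instead of raising when it cannot read a file, so it seems likely that every iteration hit the `continue` branch and nothing was ever appended, which would also explain the (0,) shapes. A minimal sketch for checking that, assuming the files are expected in the current working directory under the train_{}.jpeg pattern from the post; find_missing_files is a hypothetical helper, not part of OpenCV:

```python
from pathlib import Path


def find_missing_files(image_ids, pattern="train_{}.jpeg", base_dir="."):
    """Return the IDs whose image file does not exist under base_dir.

    `pattern` and `base_dir` are assumptions taken from the post; adjust
    them to wherever the training images actually live.
    """
    base = Path(base_dir)
    return [i for i in image_ids if not (base / pattern.format(i)).is_file()]


# Hypothetical usage with the names from the post:
# ids = [f for f, breed in df_train.values]
# missing = find_missing_files(ids)
# print(f"{len(missing)} of {len(ids)} files not found")
```

If the number of missing files equals the number of rows, the path or file extension passed to cv2.imread is the first thing to look at.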

Could it be that not converting the images to grayscale is one of the reasons they aren't appended properly? My code doesn't have a function that converts images to grayscale.

JacobOfCorns, Jul 06 '20 16:07
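
On the grayscale question: that is unlikely to be the cause. cv2.imread loads a 3-channel BGR image by default, and list.append does not care how many channels an image has. Separately, the loop in the post indexes one_hot_labels with a counter that only advances for images that loaded, so labels would drift out of step with their images whenever a file is skipped. Below is a hedged sketch of a loop that keeps each label paired with its own row and fails loudly when nothing loads. collect_samples and its parameters are hypothetical names, and it assumes one_hot_labels is in the same row order as df_train:

```python
def collect_samples(image_ids, labels, read_image, pattern="train_{}.jpeg"):
    """Load images with read_image, keeping each label paired with its image.

    read_image is any callable that returns None on failure (cv2.imread
    behaves this way). Returns (images, kept_labels, skipped_ids).
    """
    images, kept_labels, skipped = [], [], []
    for image_id, label in zip(image_ids, labels):
        img = read_image(pattern.format(image_id))
        if img is None:
            # Record the failure instead of silently moving on.
            skipped.append(image_id)
            continue
        images.append(img)
        kept_labels.append(label)
    if not images:
        # Surface the problem here rather than later in train_test_split.
        raise RuntimeError(
            f"No images loaded; all {len(skipped)} reads returned None. "
            "Check the file path and extension."
        )
    return images, kept_labels, skipped
```

With the names from the post this might be called as collect_samples([f for f, _ in df_train.values], one_hot_labels, cv2.imread). Once each image has been resized to a common size as in the original loop, np.array(images, np.float32) / 255. gives a 4-D array rather than one of shape (0,).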