
RuntimeError: stack expects each tensor to be equal size, but got [5898240] at entry 0 and [17694720] at entry 1 - MeanIoU

Open CalinLucian opened this issue 1 year ago • 1 comments

I am following these two tutorials, mixing the data fetching with the DeepLabV3+ model from keras_cv:

Tutorial 1: https://keras.io/examples/vision/oxford_pets_image_segmentation/
Tutorial 2: https://keras.io/examples/vision/deeplabv3_plus/

In tutorial 2, mean_iou computation proceeds with no issues.

My data processing and training code is attached below.

I understand that 5898240 * 3 (the number of classes) == 17694720, but I do not understand what to change in my code to make it work. If I remove keras.metrics.MeanIoU(num_classes=3), the training proceeds successfully.
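The two sizes in the error message can be reproduced from the constants in the post. This is only a sanity check, assuming the masks have shape (batch, height, width, 1) and the model output has shape (batch, height, width, num_classes); the shape tuples are illustrative assumptions, not taken from a traceback:

```python
import math

# Constants copied from the posted code.
BATCH_SIZE = 24
IMAGE_HEIGHT = 384
IMAGE_WIDTH = 640
NUM_CLASSES = 3

# Assumed shapes: one integer label per pixel vs. one score per class per pixel.
mask_shape = (BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, 1)
pred_shape = (BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WIDTH, NUM_CLASSES)

n_true = math.prod(mask_shape)  # elements after flattening the masks
n_pred = math.prod(pred_shape)  # elements after flattening the predictions

print(n_true, n_pred)
```

If these assumptions hold, the metric is flattening the raw per-class scores instead of one predicted label per pixel, which would explain the factor of 3.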

keras==3.0.1
keras-cv==0.7.2
torch==2.1.1+cu118
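# Imports assumed from the tutorials above (not shown in the original post)
import os
from glob import glob

import keras
import keras_cv
from tensorflow import data as tf_data
from tensorflow import image as tf_image
from tensorflow import io as tf_io

IMAGE_WIDTH = 640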
IMAGE_HEIGHT = 384
BATCH_SIZE = 24

# Berkeley dataset: current lane, alternate lane, background
NUM_CLASSES = 3

DATA_DIR = "./drivable_maps"
NUM_TRAIN_IMAGES = 70000
NUM_VAL_IMAGES = 10000

train_images = sorted(glob(os.path.join(DATA_DIR, "images/training/*")))[:NUM_TRAIN_IMAGES]
train_masks = sorted(glob(os.path.join(DATA_DIR, "annotations/training/*")))[:NUM_TRAIN_IMAGES]
val_images = sorted(glob(os.path.join(DATA_DIR, "images/validation/*")))[
    :NUM_VAL_IMAGES
]
val_masks = sorted(glob(os.path.join(DATA_DIR, "annotations/validation/*")))[
    :NUM_VAL_IMAGES
]

print(len(train_images))
print(len(train_masks))
print(len(val_images))

def read_image(image_path, mask=False):
    image = tf_io.read_file(image_path)
    if mask:
        image = tf_image.decode_png(image, channels=1)
        image.set_shape([None, None, 1])
        image = tf_image.resize(images=image, size=[IMAGE_HEIGHT, IMAGE_WIDTH])
    else:
        image = tf_image.decode_jpeg(image, channels=3)
        image.set_shape([None, None, 3])
        image = tf_image.resize(images=image, size=[IMAGE_HEIGHT, IMAGE_WIDTH])
    return image


def load_data(image_list, mask_list):
    image = read_image(image_list)
    mask = read_image(mask_list, mask=True)
    return image, mask


def data_generator(image_list, mask_list):
    dataset = tf_data.Dataset.from_tensor_slices((image_list, mask_list))
    dataset = dataset.map(load_data, num_parallel_calls=tf_data.AUTOTUNE)
    dataset = dataset.batch(BATCH_SIZE, drop_remainder=True)
    return dataset


train_dataset = data_generator(train_images, train_masks)
val_dataset = data_generator(val_images, val_masks)

model = keras_cv.models.DeepLabV3Plus.from_preset(
    "mobilenet_v3_small_imagenet", num_classes=NUM_CLASSES, load_weights=True
)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=[
        keras.metrics.MeanIoU(num_classes=3),
        keras.metrics.SparseCategoricalAccuracy(),
    ],
)
# EPOCHS is not defined in the posted snippet
history = model.fit(train_dataset, validation_data=val_dataset, epochs=EPOCHS, verbose=1)

ERROR thrown:

RuntimeError: stack expects each tensor to be equal size, but got [5898240] at entry 0 and [17694720] at entry 1
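One possible reading of this error is that the metric receives per-class scores where it expects one label per pixel. The NumPy sketch below is not the Keras implementation; it only illustrates that taking an argmax over the class axis makes both arrays flatten to the same length before a confusion-matrix IoU is computed. The `mean_iou` helper and the tiny 2x2 batch are hypothetical. In Keras itself, `keras.metrics.MeanIoU` accepts a `sparse_y_pred` argument, and passing `sparse_y_pred=False` should apply the same argmax step, assuming the installed version supports it:

```python
import numpy as np

NUM_CLASSES = 3


def mean_iou(y_true, y_pred_scores, num_classes=NUM_CLASSES):
    """Mean IoU from integer masks (B, H, W, 1) and per-class scores (B, H, W, C)."""
    # Collapse the class axis first so both arrays flatten to B*H*W entries.
    pred_labels = np.argmax(y_pred_scores, axis=-1).reshape(-1)
    true_labels = np.asarray(y_true).astype(np.int64).reshape(-1)
    assert pred_labels.shape == true_labels.shape

    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (true_labels, pred_labels), 1)

    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both arrays
    return float(np.mean(intersection[valid] / union[valid]))


# Hypothetical batch: one image of 2x2 pixels.
y_true = np.array([[[[0], [1]], [[2], [1]]]])  # shape (1, 2, 2, 1)
y_scores = np.array(
    [[[[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
      [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]]]
)  # shape (1, 2, 2, 3)

result = mean_iou(y_true, y_scores)
print(result)

# Assumed Keras equivalent (not verified against the versions in this issue):
# keras.metrics.MeanIoU(num_classes=3, sparse_y_pred=False)
```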

CalinLucian • Dec 13 '23 23:12

Hi,

Thanks for reporting the issue. To narrow it down, could you please check whether you observe the same behavior with the TensorFlow backend as well?
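For anyone trying this check, a minimal sketch of switching backends is shown here. It assumes the backend is selected through the `KERAS_BACKEND` environment variable, which Keras 3 reads when it is first imported, so the assignment has to happen before any `import keras` in the process:

```python
import os

# Assumption: Keras 3 reads KERAS_BACKEND once, at import time, so this must
# run before the first `import keras` in the process.
os.environ["KERAS_BACKEND"] = "tensorflow"

backend_name = os.environ["KERAS_BACKEND"]
print("Requested Keras backend:", backend_name)
```

The rest of the training script can then run unchanged, which makes it easier to tell whether the failure is specific to the torch backend.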

sachinprasadhs • Jan 03 '24 19:01

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] • Apr 26 '24 01:04

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

github-actions[bot] • May 10 '24 01:05
