PENet_ICRA2021
input images with standard size like 720p
Thank you for your amazing work!
I am interested in using the ENet depth completion model. It works very well on the KITTI dataset, but when I feed it images of a different size (640x360 instead of 1216x352), I get this error:
RuntimeError: Given groups=1, weight of size [32, 4, 5, 5], expected input[1, 5, 360, 640] to have 4 channels, but got 5 channels instead.
Is the input layer tied to this form factor? What do you recommend for adapting the model to input images of size 360x640 or 720p?
Thanks for your interest! The input data is down-sampled by a factor of 2 five times, so both the height and the width must be integer multiples of 2^5=32. Your width of 640 already satisfies this, so you could crop the height to 352 pixels or pad it to 384 pixels.
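To make the multiple-of-32 rule concrete, here is a minimal sketch of the arithmetic. The helper name `nearest_multiples` is hypothetical and not part of the PENet code; it only computes the closest valid sizes below and above a given dimension.

```python
def nearest_multiples(size, factor=32):
    """Return (largest multiple <= size, smallest multiple >= size)."""
    lower = (size // factor) * factor
    upper = -(-size // factor) * factor  # ceiling division without floats
    return lower, upper


# Dimensions of the 640x360 image from the question.
for name, size in (("height", 360), ("width", 640)):
    lower, upper = nearest_multiples(size)
    print(f"{name}={size}: crop to {lower} or pad to {upper}")
```

For a height of 360 this gives 352 (crop) and 384 (pad), while a width of 640 is already a multiple of 32 and needs no change.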
Thank you for your answer! I followed your advice and center-cropped the height down to 352 pixels for both inputs (RGB and depth). Since I still got an error mentioning 'channels', I checked the mode of my color image and found it was "RGBA". After converting the color image from "RGBA" to "RGB", the code finally worked.
from PIL import Image


def crop_center(pil_img, crop_width, crop_height):
    # Crop a centered box of the requested size.
    img_width, img_height = pil_img.size
    return pil_img.crop(((img_width - crop_width) // 2,
                         (img_height - crop_height) // 2,
                         (img_width + crop_width) // 2,
                         (img_height + crop_height) // 2))


def resize_image(image_path):
    # Center-crop to the largest size whose width and height are multiples of 32.
    im = Image.open(image_path)
    im_new = crop_center(im, 32 * (im.size[0] // 32), 32 * (im.size[1] // 32))
    # Drop the alpha channel so the color image has 3 channels.
    if im_new.mode == 'RGBA':
        im_new = im_new.convert('RGB')
    # Note: this overwrites the original file in place.
    im_new.save(image_path, quality=95)
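For anyone hitting the same 'channels' error, the numbers in the message can be read as a simple sum. This is a rough sketch, assuming (as the error and the fix in this thread suggest, not as confirmed from the model code) that the first layer receives the color image stacked with a single-channel depth map. The names `MODE_CHANNELS` and `stacked_channels` are hypothetical.

```python
# Channel counts for a few common PIL image modes.
MODE_CHANNELS = {"L": 1, "RGB": 3, "RGBA": 4}

# Second dimension of the weight shape [32, 4, 5, 5] from the error message.
EXPECTED_CHANNELS = 4


def stacked_channels(color_mode, depth_channels=1):
    """Channels seen by the first layer: color channels plus depth channels."""
    return MODE_CHANNELS[color_mode] + depth_channels


for mode in ("RGB", "RGBA"):
    total = stacked_channels(mode)
    status = "ok" if total == EXPECTED_CHANNELS else "mismatch"
    print(f"{mode} + depth -> {total} channels ({status})")
```

Under that assumption, an RGBA image yields 5 channels, which matches the `expected input[1, 5, 360, 640]` part of the error, and converting to RGB brings it back to 4.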