
Is there a dataset tool to convert the original large .jpg images into smaller ones?

Open catfish132 opened this issue 3 years ago • 2 comments

I have downloaded the MIMIC-CXR dataset and want to train a model myself. But the original images are too big: the total volume is 500 GB, so disk I/O will be a bottleneck. Is there a script to convert the images into smaller ones?

catfish132 avatar Oct 30 '22 03:10 catfish132

Yes, I used this script for resizing. It calls the convert command-line utility to do the work.

https://github.com/mlmed/torchxrayvision/blob/master/scripts/convert-single.sh

The convert-all.sh script iterates over every image and calls the single-image script on each one, using multiple threads.
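For anyone who would rather stay in Python, the same fan-out pattern can be sketched roughly as below. This is not the contents of convert-all.sh. The names resize_one, plan_jobs and resize_all, the 512-pixel default and the worker count are all assumptions made for illustration, and Pillow is assumed to be installed for the actual resizing step.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resize_one(src_path, dst_path, max_side=512):
    """Shrink one image so that its longer side is at most max_side pixels."""
    # Pillow is imported lazily so the planning code below works without it.
    from PIL import Image

    os.makedirs(os.path.dirname(dst_path) or ".", exist_ok=True)
    with Image.open(src_path) as img:
        img.thumbnail((max_side, max_side), Image.LANCZOS)
        img.save(dst_path)
    return dst_path


def plan_jobs(input_root, output_root, extensions=(".jpg", ".jpeg")):
    """Pair every image under input_root with a mirrored path under output_root."""
    jobs = []
    for dirpath, _dirnames, filenames in os.walk(input_root):
        for name in sorted(filenames):
            if name.lower().endswith(extensions):
                src = os.path.join(dirpath, name)
                rel = os.path.relpath(src, input_root)
                jobs.append((src, os.path.join(output_root, rel)))
    return jobs


def resize_all(input_root, output_root, workers=8, worker=resize_one):
    """Run worker over every planned (source, destination) pair in a thread pool."""
    jobs = plan_jobs(input_root, output_root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: worker(*job), jobs))
```

Calling resize_all('/path/to/original/images', '/path/to/resized/images') would walk the input tree, keep the subdirectory layout, and write the shrunken copies under the output root. Threads are used here only to mirror the description above; whether they or separate processes are faster on a given machine is something to measure.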

ieee8023 avatar Oct 30 '22 07:10 ieee8023

@catfish132

    import os

    from PIL import Image

    input_folder = '/path/to/original/images'
    output_folder = '/path/to/resized/images'
    target_size = (512, 512)  # set your desired target size

    # create the output folder if it does not exist yet
    os.makedirs(output_folder, exist_ok=True)

    for file_name in os.listdir(input_folder):
        # open image file
        with Image.open(os.path.join(input_folder, file_name)) as img:
            # shrink the image in place while maintaining aspect ratio
            # (Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter)
            img.thumbnail(target_size, Image.LANCZOS)
            # save the resized image
            img.save(os.path.join(output_folder, file_name))

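This code shrinks every image in input_folder so that it fits within target_size, keeping its aspect ratio, and saves the result in output_folder. Because thumbnail preserves the aspect ratio, a non-square image will not come out at exactly target_size. You can modify target_size to match your requirements.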

Note that resizing the images may result in a loss of some image details, depending on the degree of resizing. Therefore, you may need to experiment with different target sizes to find the optimal balance between image size and image quality for your specific use case.
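To get a feel for that trade-off before converting anything, the arithmetic behind an aspect-preserving fit can be sketched in plain Python. The helpers fit_within and pixel_fraction are hypothetical names introduced here, and the 3000x2500 example size is made up rather than taken from MIMIC-CXR. Pillow's own rounding may differ by a pixel, so treat the output as an estimate.

```python
def fit_within(size, target):
    """Return (width, height) scaled to fit inside target, keeping aspect ratio.

    Never upscales: an image already within target is returned unchanged.
    """
    width, height = size
    max_w, max_h = target
    if width <= max_w and height <= max_h:
        return (width, height)
    # The relatively larger side reaches its limit first.
    if width * max_h >= height * max_w:
        return (max_w, max(1, round(height * max_w / width)))
    return (max(1, round(width * max_h / height)), max_h)


def pixel_fraction(size, target):
    """Fraction of the original pixel count that survives the resize."""
    new_w, new_h = fit_within(size, target)
    return (new_w * new_h) / (size[0] * size[1])


if __name__ == "__main__":
    original = (3000, 2500)  # hypothetical radiograph size
    for side in (224, 512, 1024):
        box = (side, side)
        print(side, fit_within(original, box), f"{pixel_fraction(original, box):.2%}")
```

The surviving pixel fraction is only a rough proxy for detail and for file size, since JPEG compression does not scale linearly with pixel count, but it helps narrow down which target sizes are worth trying.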

Adesoji1 avatar Apr 05 '23 11:04 Adesoji1