torchxrayvision
Any dataset tool to convert the original large .jpg files into smaller ones?
I have downloaded the MIMIC-CXR dataset and want to train a model myself. But the original images are too large: the total volume is about 500 GB, so disk I/O will be a bottleneck. Is there a script to convert the images into smaller ones?
Yes, I used this script for resizing. It calls the convert utility to do the resizing.
https://github.com/mlmed/torchxrayvision/blob/master/scripts/convert-single.sh
The convert-all.sh script iterates over the images and calls the single-image script on each one, using multiple threads.
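For anyone who prefers Python to shell, here is a rough sketch of the same pattern (walk the dataset, call convert once per image, run several calls in parallel). It is not a port of the repository scripts: the function names, the 512-pixel default, the worker count, and the assumption that convert is ImageMagick's command-line tool on your PATH are all mine. The `-resize 512x512>` geometry tells ImageMagick to shrink only images larger than the box while keeping the aspect ratio.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src, dst, size=512):
    # Assumes ImageMagick's `convert`; the trailing ">" means "only shrink".
    return ["convert", str(src), "-resize", f"{size}x{size}>", str(dst)]


def plan_jobs(input_dir, output_dir):
    # Pair every .jpg under input_dir with a mirrored path under output_dir.
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    return [
        (src, output_dir / src.relative_to(input_dir))
        for src in sorted(input_dir.rglob("*.jpg"))
    ]


def convert_all(input_dir, output_dir, size=512, workers=8, run=subprocess.run):
    # `run` is injectable so the planning logic can be checked without ImageMagick.
    def work(job):
        src, dst = job
        dst.parent.mkdir(parents=True, exist_ok=True)
        run(build_command(src, dst, size), check=True)
        return dst

    # Threads are enough here: the heavy lifting happens in the child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, plan_jobs(input_dir, output_dir)))


if __name__ == "__main__":
    convert_all("/path/to/original/images", "/path/to/resized/images")
```

The output tree mirrors the input tree, so any existing CSV or metadata that refers to relative image paths should keep working against the resized copy.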
@catfish132
```python
import os

from PIL import Image

input_folder = '/path/to/original/images'
output_folder = '/path/to/resized/images'
target_size = (512, 512)  # maximum (width, height); set your desired target size

# create the output folder if it does not exist yet
os.makedirs(output_folder, exist_ok=True)

for file_name in os.listdir(input_folder):
    # open the image file
    with Image.open(os.path.join(input_folder, file_name)) as img:
        # shrink in place to fit within target_size while maintaining aspect ratio
        # (Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter)
        img.thumbnail(target_size, Image.LANCZOS)
        # save the resized image under the same file name
        img.save(os.path.join(output_folder, file_name))
```
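This code shrinks every image in input_folder so that it fits within target_size, preserving the aspect ratio, and saves the result in output_folder. Because thumbnail() keeps the aspect ratio and never enlarges, non-square images end up smaller than target_size in one dimension. It only looks at files directly inside input_folder, so nested subfolders need an extra loop. You can modify target_size to match your requirements.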
Note that resizing the images may result in a loss of some image details, depending on the degree of resizing. Therefore, you may need to experiment with different target sizes to find the optimal balance between image size and image quality for your specific use case.
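To get a feel for how much is discarded before committing to a size, a quick back-of-the-envelope calculation can help. The sketch below is only an approximation of an aspect-preserving fit (Pillow's exact rounding may differ by a pixel), and the 3000x2500 source size is a made-up example, not a measured MIMIC-CXR dimension.

```python
def fit_within(width, height, target):
    # Scale so the image fits inside a target x target box; never enlarge.
    scale = min(target / width, target / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def pixel_fraction(width, height, target):
    # Share of the original pixel count that survives the resize.
    w, h = fit_within(width, height, target)
    return (w * h) / (width * height)


if __name__ == "__main__":
    # Hypothetical source resolution, for illustration only.
    for target in (224, 512, 1024):
        w, h = fit_within(3000, 2500, target)
        share = pixel_fraction(3000, 2500, target)
        print(f"{target}: {w}x{h}, {share:.1%} of the original pixels")
```

Pixel count is only a proxy for disk size, since JPEG compression depends on image content, but it gives a rough idea of how quickly storage and I/O shrink as the target size drops.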