Move torchvision.utils.save_image -> torchvision.io.save_image (or introduce this function from scratch)
🚀 The feature
I think the io namespace makes more sense, and it's where torchvision.io.read_image is located...
Motivation, pitch
N/A
Alternatives
No response
Additional context
No response
Yes, I agree. Not sure why it was added under utils, but I agree it makes sense for it to live in io. As this is BC-breaking, we should roll out our new deprecation policy and then investigate the impact and whether it's worth moving it.
utils.save_image is a very old function (since 0.4, I suppose, much before io.write_jpeg). I had pointed this out to @NicolasHug too in an offline discussion about deprecating it and promoting the use of io.write_jpeg or io.write_png.
The small difference is that utils.save_image saves an image tensor, whereas io works over PIL images. (Anyway, the image tensor can be converted to PIL with transforms.ToPILImage.)
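(A minimal sketch of the workflow described above, assuming a float image tensor in [0, 1] of shape (C, H, W); the file name is illustrative:)

```python
import torch
from torchvision.transforms import ToPILImage

img = torch.rand(3, 64, 64)  # float tensor in [0, 1], shape (C, H, W)
ToPILImage()(img).save("out.png")  # convert to PIL, then use PIL's own save
```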
I don't think we should move save_image, in fact I think we should deprecate it. It's not working as users would expect it to work. It adds padding pixels by default, and it relies on make_grid which has a bad API. It's causing confusion to users, see e.g. https://github.com/pytorch/vision/issues/5211 and my reply there.
I don't mind having a completely new function with this functionality in io, and it could redirect the call to write_jpeg / write_png according to its arguments / file extension, just like OpenCV does. A useful feature could be supporting batches of images, to parallelize image encoding and the waiting for the file system to complete saving.
In general, I hope the full transition from PIL images to torch tensors for all tasks will be completed soon, especially for the io functions.
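A minimal sketch of the kind of dispatching function discussed above (the name torchvision.io.write_image is hypothetical; it only forwards to the existing write_jpeg / write_png based on the file extension):

```python
from pathlib import Path

import torch
from torchvision.io import write_jpeg, write_png


def write_image(image: torch.Tensor, filename: str, **kwargs) -> None:
    """Hypothetical dispatcher: pick the encoder from the file extension.

    Expects a uint8 tensor of shape (C, H, W), like write_jpeg / write_png do.
    """
    ext = Path(filename).suffix.lower()
    if ext in (".jpg", ".jpeg"):
        write_jpeg(image, filename, **kwargs)  # e.g. quality=...
    elif ext == ".png":
        write_png(image, filename, **kwargs)  # e.g. compression_level=...
    else:
        raise ValueError(f"Unsupported image extension: {ext!r}")
```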
I also think save_image should be deprecated: it assumes the input is in the [0, 1] range and gives no indication that the [0, 255] integer representation is not supported. This is also not explained in the documentation.
Also, a batched/parallelized read_image API could cover some simple cases where otherwise a full dataset/dataloader would be needed (if practical enough to implement).
I agree with deprecating save_image here. Any ideas on how to save a uint64 tensor in the range [0, 255] as an image?
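(For what it's worth, a minimal sketch of one option with the existing io functions, assuming the values really are integers in [0, 255]: cast to uint8 and use write_png. The tensor and file name are illustrative.)

```python
import torch
from torchvision.io import write_png

# Illustrative tensor: integer values in [0, 255] stored in a wider dtype.
img = torch.randint(0, 256, (3, 64, 64), dtype=torch.int64)
write_png(img.to(torch.uint8), "out.png")  # write_png expects uint8, (C, H, W)
```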
Related: https://github.com/pytorch/vision/issues/6255
Also, a batched/parallelized read_image API could cover some simple cases where otherwise a full dataset/dataloader would be needed (if practical enough to implement).
One in-the-wild use case for a simple batch image reading function (that would handle parallelism internally): https://github.com/pytorch/pytorch/issues/88978
I also wonder if supporting an out= tensor/list could be a good idea, especially if the output tensor is CUDA-pinned.
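A rough sketch of what such a batched reader could look like (the name read_images, the thread pool, and the out= handling are all hypothetical; it assumes all images decode to the same shape, and whether the threads actually overlap depends on the decoder releasing the GIL):

```python
from concurrent.futures import ThreadPoolExecutor

import torch
from torchvision.io import read_image


def read_images(paths, out=None, num_threads=8):
    """Hypothetical batched reader: decode same-sized images in parallel threads.

    If `out` is given (e.g. a pre-allocated (N, C, H, W) tensor created with
    pin_memory=True), decoded images are copied into it; otherwise a new
    batch tensor is stacked and returned.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        images = list(pool.map(read_image, paths))
    if out is None:
        return torch.stack(images)
    for i, img in enumerate(images):
        out[i].copy_(img)
    return out
```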
It is rather confusing for users that functions with similar functionality are placed in two different modules.
Sorry, moving save_image probably isn't going to happen, so I'll close this issue.
We acknowledge that the situation is not ideal and confusing, but moving it would still require a deprecation / disruption, and as explained above in https://github.com/pytorch/vision/issues/5461#issuecomment-1048612641 I don't think save_image should exist (as-is) in the first place. Try using the better-supported alternatives detailed in https://github.com/pytorch/vision/issues/5211#issuecomment-1016255934
@NicolasHug this issue also proposes parallelized/batch image saving (as per the title). Does write_jpeg already support this feature, or is it also a wontfix?
Feel free to open a separate issue for that. We can clarify what parallelism means there.
(I'm not particularly sold on the "batch saving" idea considering it's really just a call to torch.cat())
The idea is to have the I/O and the encoding/decoding itself of multiple images done in parallel.
We also discussed in this issue creating a new dispatching function in the proper torchvision.io namespace (e.g. torchvision.io.write_image, similar to the functions found in other frameworks like MATLAB or OpenCV), so this issue was not just about moving the existing function (deemed obsolete and to-be-deprecated) to the proper namespace.
done in parallel
multithreading? multi-processing?
Neither is going to help much when using DataLoader(num_workers=N), because num_threads is set to 1 and multiprocessing is already enabled. Apart from training, in what situations could "parallelism" be critical?
You are right, inside the DataLoader context this is not adding anything useful.
But this is useful in other contexts (i.e. not inside DataLoader workers / when there is no DataLoader instance at all), e.g. when generating visualizations of many images (frequent when embedding visualization images in HTML files) or when processing a directory of images as a single batch. This can simplify the code in simple scenarios, and the I/O can be parallelized. So yes, some simple Python multiprocessing (or, better, OpenMP in C++, if the C++ libtorch is still being maintained) would do.
Okay, I will create new separate issues for these ideas.
Is speed absolutely critical in these scenarios? If it is, would it make sense to let users enable that kind of parallelism themselves? joblib seems like the perfect tool for the job here: it supports both a multiprocessing and a multithreading backend.
I consider this a matter of idioms / saving users time reimplementing common things / having tested utils. Probably just using multithreading would work, but dynamically setting the number of threads might be non-trivial. In any case, it would benefit from some proper testing/measurement.
For the visualization case, one is often too lazy to measure/profile the plot generation (often hundreds of plots) and the encoding to JPEG and base64, since this is usually scratch, write-only code to be thrown away. So I always kept them serial, but I suspect that image encoding and base64 encoding actually take half of the running time of these scripts (the other half being matplotlib itself being slow, rendering to an image included). These scripts end up being run many times, and each execution can easily start taking dozens of seconds, so in the end a lot of time is wasted.
For the "reading a directory of images for inference/testing" - I think also yes, parallelization can help and offset a lot of CPU-bound decoding and overlapped I/O and is definitely a simpler API than spinning a full DataLoader for this
I highly recommend you consider joblib, @vadimkantorov. I think it does what you need (and more) with 0.01% of the effort it would take to implement a parallel "save_images()" util. Knowing the joblib devs, it's been tested to death and it's just as reliable as anything coming out of pytorch.
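For example, a minimal joblib sketch for encoding a batch of uint8 tensors with the existing write_jpeg (the batch, file names and n_jobs are illustrative):

```python
import torch
from joblib import Parallel, delayed
from torchvision.io import write_jpeg

# Illustrative batch: 16 uint8 (C, H, W) images to encode in parallel.
images = [torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8) for _ in range(16)]

# prefer="threads" avoids pickling the tensors; switch to the process backend
# if JPEG encoding turns out not to release the GIL.
Parallel(n_jobs=4, prefer="threads")(
    delayed(write_jpeg)(img, f"img_{i:04d}.jpg", quality=90)
    for i, img in enumerate(images)
)
```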
Thank you. I recognize there exist good foundations in Python for parallelization. My point is that batched support for image encoding/decoding could fit nicely into the general pattern of supporting batches, handling parallelization / thread-count selection under the hood, and making users' lives a bit simpler - even if it just uses joblib under the hood as well.
Of course there are downsides as well, so treat it like a suggestion, no more :)