Add Transforms Benchmark
- Add `benchmark_transforms.py`, which allows one to compare the execution time of a sequential pipeline of our transforms + kornia augmentations for CPU vs. GPU across various batch and image/mask sizes.
Overall thoughts:
- How fast are these transforms? I assume ~1 sec or less. We might want to loop over them multiple times to get a more accurate speed measurement.
- Do we want to benchmark the CPU transforms in parallel with multiple workers? Users could use them after data loading or as an argument to `dataset`, right?
- Depends. CPU and GPU are quite similar for small images and small batch sizes, as expected. In the opposite case, the GPU will greatly outperform the CPU. However, it's not exactly an apples-to-apples comparison, as you mention, because the CPU in this case isn't taking advantage of any parallelism.
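Something along these lines is what I have in mind for the timing (a minimal sketch only; the specific augmentations here are placeholders, not the actual ones in `benchmark_transforms.py`). Averaging over many iterations after a warm-up, and calling `torch.cuda.synchronize()` around the timed region, keeps the GPU numbers honest since CUDA kernels launch asynchronously:

```python
import time

import torch
import kornia.augmentation as K

# Placeholder pipeline -- the real transforms live in benchmark_transforms.py
pipeline = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomVerticalFlip(p=0.5),
    data_keys=["input"],
)

def time_pipeline(batch: torch.Tensor, num_iters: int = 100, warmup: int = 10) -> float:
    """Return mean seconds per iteration over num_iters timed runs."""
    for _ in range(warmup):  # exclude one-time CUDA init costs from the timing
        pipeline(batch)
    if batch.is_cuda:
        torch.cuda.synchronize()  # kernels launch async; drain before starting the clock
    start = time.perf_counter()
    for _ in range(num_iters):
        pipeline(batch)
    if batch.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / num_iters

# e.g. time_pipeline(torch.rand(16, 3, 256, 256, device="cuda"))
```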
- No, we don't. This is a tricky one to measure, but I think we can use a `torch.utils.data.TensorDataset`, so there should be no I/O overhead when loading a batch using multiple workers, since the tensors will already be loaded in memory.
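Roughly what that comparison could look like (again a sketch; the pipeline, tensor sizes, batch size, and worker count are all hypothetical). The `TensorDataset` holds pre-loaded tensors and the transforms run inside the `DataLoader` workers via `collate_fn`, so the loop mostly measures the augmentations themselves:

```python
import time

import torch
import kornia.augmentation as K
from torch.utils.data import DataLoader, TensorDataset

# Placeholder CPU pipeline standing in for the transforms under test
cpu_pipeline = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomVerticalFlip(p=0.5),
    data_keys=["input", "mask"],
)

def transform_batch(samples):
    # Runs inside each DataLoader worker, so the transforms are parallelized
    images = torch.stack([s[0] for s in samples])
    masks = torch.stack([s[1] for s in samples])
    return cpu_pipeline(images, masks)

if __name__ == "__main__":
    # Tensors already live in memory, so there is no per-batch I/O cost
    dataset = TensorDataset(
        torch.rand(1024, 3, 256, 256),  # images (hypothetical sizes)
        torch.rand(1024, 1, 256, 256),  # masks
    )
    loader = DataLoader(
        dataset, batch_size=32, num_workers=4, collate_fn=transform_batch
    )

    start = time.perf_counter()
    for _ in loader:  # iterating drives the workers; outputs are discarded
        pass
    print(f"total: {time.perf_counter() - start:.3f} s")
```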
Let's not merge this yet so I can implement a true comparison per my above comment for GPU augs vs CPU augs with multiple workers.
Also, other benchmark scripts and experiment scripts are missing the license comment, just a heads up.
Let us know about this and see if we can be of any help here. We are currently refactoring the augmentations module, just reorganizing code for further improvements. /cc @shijianjian @twsl
@isaaccorley is this still a WIP or should we close it?
This is still a WIP, but it's close to done. I think the only thing left is to figure out how we want to compare to the CPU in an apples-to-apples way.