Setting the `sharpness_factor` argument of `RandomAdjustSharpness()` to `-5` and `5` blurs and sharpens images respectively

Open · hyperkai opened this issue 11 months ago · 1 comment

📚 The doc issue

The documentation for RandomAdjustSharpness() says:

Adjust the sharpness of the image or video with a given probability. ... Parameters:

  • sharpness_factor (float) – How much to adjust the sharpness. Can be any non-negative number. 0 gives a blurred image, 1 gives the original image while 2 increases the sharpness by a factor of 2.

But the sharpness_factor argument of RandomAdjustSharpness() also accepts negative values: setting it to -5 and 5 blurs and sharpens images respectively, as shown below:

from torchvision.datasets import OxfordIIITPet
from torchvision.transforms.v2 import RandomAdjustSharpness

sfn5p1_data = OxfordIIITPet( # `sf` is sharpness_factor.
    root="data",             # `n` is negative.
    transform=RandomAdjustSharpness(sharpness_factor=-5, p=1)
)

sf1p1origin_data = OxfordIIITPet(
    root="data",
    transform=RandomAdjustSharpness(sharpness_factor=1, p=1)
)

sf5p1_data = OxfordIIITPet(
    root="data",
    transform=RandomAdjustSharpness(sharpness_factor=5, p=1)
)

import matplotlib.pyplot as plt

def show_images(data, main_title=None):
    plt.figure(figsize=[10, 5])
    plt.suptitle(t=main_title, y=0.8, fontsize=14)
    for i, (im, _) in zip(range(1, 6), data):
        plt.subplot(1, 5, i)
        plt.imshow(X=im)
        plt.xticks(ticks=[])
        plt.yticks(ticks=[])
    plt.tight_layout()
    plt.show()

show_images(data=sfn5p1_data, main_title="sfn5p1_data")
show_images(data=sf1p1origin_data, main_title="sf1p1origin_data")
show_images(data=sf5p1_data, main_title="sf5p1_data")

[Three images: the figures produced by the three show_images() calls above]

Suggest a potential alternative/fix

So the documentation for RandomAdjustSharpness() should say something like this:

Sharpen or blur an image or video with a given probability. ... Parameters:

  • sharpness_factor (float) – How much to adjust the sharpness. Can be any negative or non-negative number. x < 1 gives a blurred image, 1 gives the original image while 1 < x gives a sharpened image.
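To make the role of the factor concrete, here is a minimal pure-Python sketch. It rests on an assumption that is not stated in this issue: that the adjustment blends the image with a smoothed copy as `factor * original + (1 - factor) * smoothed` and then clips to the valid range, which is my understanding of how Pillow-style sharpness enhancement behaves for PIL inputs. The helper names (`smooth`, `adjust_sharpness_1d`, `edge_jump`) and the 3-tap average are hypothetical stand-ins, not torchvision code.

```python
def smooth(signal):
    """3-tap moving average with edge replication (hypothetical stand-in for the real smoothing filter)."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + signal[i] + right) / 3)
    return out


def adjust_sharpness_1d(signal, factor, lo=0.0, hi=255.0):
    """Assumed blend: factor * original + (1 - factor) * smoothed, clipped to [lo, hi]."""
    blurred = smooth(signal)
    return [
        min(max(factor * s + (1 - factor) * b, lo), hi)
        for s, b in zip(signal, blurred)
    ]


def edge_jump(signal):
    """Largest absolute difference between neighbouring samples."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


# A 1-D "image": a single step edge from dark to bright.
edge = [64.0] * 4 + [192.0] * 4

for factor in (-5, 0, 1, 5):
    result = adjust_sharpness_1d(edge, factor)
    print(factor, [round(v, 1) for v in result])
```

Under this assumption, a factor of 0 returns the smoothed copy, 1 returns the input unchanged, values above 1 push the edge further from the smoothed copy, and negative values overshoot past it. Nothing in the formula itself rejects a negative factor, which would be consistent with the behaviour reported above.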

hyperkai, Feb 18 '25 01:02

fair - happy to consider a PR @hyperkai

NicolasHug, Feb 19 '25 13:02