How to create a tensor in a custom python function within define_graph

Open rachelglenn opened this issue 1 year ago • 4 comments

Describe the question.

How do I create new torch tensors and have them go to the correct device? I would like to do things like taking the square of a tensor. I found this example: https://docs.nvidia.com/deeplearning/dali/archives/dali_1_18_0/user-guide/docs/examples/custom_operations/python_operator.html

def edit_images(image1, image2):
    assert image1.shape == image2.shape
    for i in range(c):
        h, w, c = image1.shape
        perturbation = torch.rand(h, w) 
        new_image1 = torch.zeros(h,w,c)
        new_image2 = torch.zeros(h,w,c)
        new_image1[:, :, i] = image1[:, :, i] * torch.square(perturbation)
        new_image2[:, :, i] = image2[:, :,i]  * torch.square(perturbation)
    return new_image1, new_image2

Check for duplicates

  • [X] I have searched the open bugs/issues and have found no duplicates for this bug report

rachelglenn, Jul 03 '24 12:07

Hello @rachelglenn, I strongly advise against using PythonFunction for functionality that has good native support. You can get tensors filled with random values with the functions from dali.fn.random. Elementwise squaring can be achieved simply by multiplying a tensor by itself, like:

   # passing image as the argument will cause the function to return an array shaped like the image
   perturbation = fn.random.uniform(image1, range=[0, 1])  # this already includes channel
   pert_squared = perturbation * perturbation
   new_image1 = image1 * pert_squared
   new_image2 = image2 * pert_squared

BTW - it seems like the code is incorrect (swapped lines?):

    for i in range(c):
        h, w, c = image1.shape # c defined here, but loop over range(c) above
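A corrected version of the loop could look like the sketch below. It assumes the intent was one random perturbation per channel, shared by both images, and that the outputs should be float tensors on the same device as the inputs; neither assumption is confirmed by the original post.

```python
import torch


def edit_images(image1, image2):
    assert image1.shape == image2.shape
    # Unpack the shape before the loop so that c exists when range(c) runs.
    h, w, c = image1.shape
    # Allocate the outputs once, on the same device as the inputs.
    new_image1 = torch.zeros(h, w, c, device=image1.device)
    new_image2 = torch.zeros(h, w, c, device=image2.device)
    for i in range(c):
        # Assumption: one perturbation per channel, applied to both images.
        pert_squared = torch.square(torch.rand(h, w, device=image1.device))
        new_image1[:, :, i] = image1[:, :, i] * pert_squared
        new_image2[:, :, i] = image2[:, :, i] * pert_squared
    return new_image1, new_image2
```

Hoisting the allocations out of the loop matters as much as moving the shape unpacking: in the posted code the outputs are re-created on every iteration, so only the last channel would survive.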

mzient, Jul 03 '24 13:07

Still, if you prefer to use torch_python_function, just use torch.cuda.device inside the callable to select the device.
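As a minimal sketch of that suggestion, the callable below makes the input's GPU the current device before creating a new tensor. The function name square_with_noise is hypothetical, and the CPU branch is only there so the function also works on CPU tensors; how DALI hands tensors to the callable is not shown here.

```python
import torch


def square_with_noise(image):
    """Multiply image by a squared random perturbation on the image's device."""
    if image.is_cuda:
        # Make the input's GPU current so that device="cuda" resolves to it.
        with torch.cuda.device(image.device):
            perturbation = torch.rand(image.shape, device="cuda")
    else:
        # CPU fallback: new tensors are created on the CPU by default.
        perturbation = torch.rand(image.shape)
    return image * torch.square(perturbation)
```

Passing device=image.device directly to torch.rand is an equivalent alternative that avoids the context manager.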

JanuszL, Jul 03 '24 13:07

Thank you, I really appreciate it. Another question: I have a pipeline class defined. Do I have to specify that the loaded data should go to the GPU? load_data calls the DALI numpy reader. For some reason, the output of the model (data) is on the CPU and not the GPU.

def define_graph(self):
    data = self.load_data()
    data = data.to('cuda')
    return data

rachelglenn, Jul 10 '24 22:07

Hi @rachelglenn,

To move data from CPU to GPU inside the pipeline, please call .gpu() on the reader output:

def define_graph(self):
    data = self.load_data()
    data = data.gpu()
    return data

JanuszL, Jul 10 '24 23:07