TypeError: 'DataNode' object does not support item assignment

Open jackdaw213 opened this issue 1 year ago • 13 comments

Describe the question.

@pipeline_def(device_id=0)
    def dali_pipeline(image_dir):
        images, _ = fn.readers.file(file_root=image_dir, 
                                    files=utils.list_images(image_dir),
                                    random_shuffle=True, 
                                    name="Reader")
        H, W = 256, 256
        
        images = fn.decoders.image(images, device="mixed", output_type=types.RGB)
        images = images / 255
        images = fn.resize(images, size=512)
        images = fn.crop_mirror_normalize(images, 
                                        dtype=types.FLOAT,
                                        output_layout="HWC",
                                        crop=(H, W),
                                        crop_pos_x=fn.random.uniform(range=(0, 1)),
                                        crop_pos_y=fn.random.uniform(range=(0, 1)))

        images = fn.python_function(images, function=rgb2lab)

        images = fn.transpose(images, perm=[2, 0, 1])
        color = images[1:, :, :] / 110
        black = (fn.expand_dims(images[0, :, :], axes=0) - 50) / 100

        #How to loop through each image ?
        for per image:
            g = Geometric(1/8)
            P = np.random.choice([1, 2, 3, 4, 5, 6, 7, 8, 9])
            mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
            
            for i in range(int(g.sample().item())):
                h = int(torch.clip(torch.normal(mean=torch.tensor((H-P+1)/2.), std=torch.tensor((H-P+1)/4.)), 0, W-P))
                w = int(torch.clip(torch.normal(mean=torch.tensor((W-P+1)/2.), std=torch.tensor((W-P+1)/4.)), 0, W-P))
                
                # Error here
                mask[:,h:h+P,w:w+P] = fn.reductions.mean(fn.reductions.mean(color[:,h:h+P,w:w+P],axes=2,keep_dims=True),axes=1,keep_dims=True)
        
        black = fn.cat(black, mask, axis=0)

        return black, color

My dali pipeline is the above and I have 2 questions:

  1. How do I loop and calculate a mask for each image? Is there an efficient way to do it without a loop?
  2. mask is a DataNode and does not support item assignment. How can I resolve this? I tried creating the mask as a PyTorch tensor and using torch.mean(), but color is a DataNode, which does not work with PyTorch.

Check for duplicates

  • [X] I have searched the open bugs/issues and have found no duplicates for this bug report

jackdaw213 avatar May 14 '24 07:05 jackdaw213

Hi @jackdaw213,

Thank you for reaching out. A couple of observations from our side:

  • You shouldn't loop over the images explicitly: in DALI the batch is implicit, and each operation is applied to every sample in it. If you want different processing for some samples, please check conditional execution.
  • DALI doesn't support loops inside the pipeline whose trip count is evaluated at run time. So for i in range(3) will work, but for i in range(fn.random....) won't.
  • DataNode does not support item assignment: you can create multiple pieces of the mask and then cat them together, like here.
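
The "pieces plus cat" idea from the last bullet can be sketched in plain NumPy. This is only an illustration of the construction, not DALI code: np.concatenate stands in for fn.cat, and the helper name mask_from_pieces and the toy sizes are hypothetical.

```python
import numpy as np

def mask_from_pieces(patch, top, left, height, width):
    """Place `patch` (C, p_h, p_w) in a zero (C, height, width) mask using only concatenation."""
    channels, p_h, p_w = patch.shape
    dtype = patch.dtype
    # Row band: [zeros | patch | zeros] along the width axis.
    left_pad = np.zeros((channels, p_h, left), dtype=dtype)
    right_pad = np.zeros((channels, p_h, width - left - p_w), dtype=dtype)
    band = np.concatenate([left_pad, patch, right_pad], axis=2)
    # Full mask: [zeros; band; zeros] along the height axis.
    top_pad = np.zeros((channels, top, width), dtype=dtype)
    bottom_pad = np.zeros((channels, height - top - p_h, width), dtype=dtype)
    return np.concatenate([top_pad, band, bottom_pad], axis=1)

# Toy example: a 3x3 patch of 0.5 placed at row 2, column 4 of an 8x8 mask.
patch = np.full((2, 3, 3), 0.5, dtype=np.float32)
mask = mask_from_pieces(patch, top=2, left=4, height=8, width=8)
```

No element of an existing array is assigned to here; the mask is assembled from five pieces, which is the property a DataNode-based version would need.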

JanuszL avatar May 14 '24 16:05 JanuszL

Hello @JanuszL, thank you for the answers

You shouldn't loop over the images explicitly: in DALI the batch is implicit, and each operation is applied to every sample in it. If you want different processing for some samples, please check conditional execution.

The operation creates g.sample() squares of size P, calculates the average color inside each square, and cats the result with black; this is done for each image. I think I can figure out a workaround for g.sample(), but I do not know how to approach the per-image average color mask. Any tips?

DALI doesn't support loops inside the pipeline whose trip count is evaluated at run time. So for i in range(3) will work, but for i in range(fn.random....) won't.

Ah that's unfortunate, thank you for the info

DataNode does not support item assignment: you can create multiple pieces of the mask and then cat them together, like here.

I don't know if cat can replace the item assignments in this situation, or maybe I do not understand the example correctly. color holds the two AB channels of the LAB image, and I want to sample multiple squares of size P and average the color inside each one, then cat the mask containing those squares with L/black. But I don't see, from the examples you gave me, how to cat multiple squares individually into the mask.

jackdaw213 avatar May 15 '24 02:05 jackdaw213

Hello @jackdaw213, If your loop isn't too long, then it should be possible to unroll it to the maximum length. With that you could use conditional execution to emulate shorter loops. Inside the loop you could generate the mean values and your square coordinates which you could then stack and pass to fn.erase.
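
A minimal sketch of the unrolling idea, written in NumPy rather than DALI so that it runs anywhere. MAX_ITERS is a hypothetical upper bound, and np.where merely stands in for DALI's conditional execution; the loop body is a placeholder counter, not the mask computation.

```python
import numpy as np

MAX_ITERS = 16  # hypothetical upper bound on the loop length

rng = np.random.default_rng(0)
batch_size = 4
# Per-sample iteration counts, capped at the unroll length.
counts = np.minimum(rng.geometric(1 / 8, size=batch_size), MAX_ITERS)

applied = np.zeros(batch_size, dtype=np.int64)
for i in range(MAX_ITERS):      # fixed trip count, known before anything runs
    active = i < counts         # one boolean per sample
    # Stand-in for the loop body: only samples that are still active are updated.
    applied = np.where(active, applied + 1, applied)
```

After the loop, each sample has been processed exactly counts[i] times, even though every sample went through the same MAX_ITERS unrolled steps.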

mzient avatar May 15 '24 09:05 mzient

Hello @mzient, thank you for your response

If your loop isn't too long, then it should be possible to unroll it to the maximum length. With that you could use conditional execution to emulate shorter loops

That's a good idea but sometimes the loop would get quite lengthy so that's not suitable for this situation

Inside the loop you could generate the mean values and your square coordinates which you could then stack and pass to fn.erase.

This will create the same mask with the same square size/location for all images in the batch right ? But I want a different mask for each image in the batch, is it possible ?

jackdaw213 avatar May 15 '24 14:05 jackdaw213

Hello @mzient, thank you for your response

If your loop isn't too long, then it should be possible to unroll it to the maximum length. With that you could use conditional execution to emulate shorter loops

That's a good idea but sometimes the loop would get quite lengthy so that's not suitable for this situation

How long could it get?

Inside the loop you could generate the mean values and your square coordinates which you could then stack and pass to fn.erase.

This will create the same mask with the same square size/location for all images in the batch right ? But I want a different mask for each image in the batch, is it possible ?

No; the batch in DALI is implicit. When you do: slice = img[t:b, l:r], you're in fact slicing all images - and if l, t, r, b are DataNodes, not constants, they too are batches. With explicit batch this would be:

slice = [
  img[i][t[i]:b[i], l[i]:r[i]] for i in range(batch_size)
]
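
The same point as a runnable NumPy sketch, assuming a toy batch of 2-channel images and one random (top, left) coordinate per sample. The variable names and sizes are illustrative only; in a pipeline, the list comprehensions below are what a single expression on DataNodes does implicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, H, W, P = 4, 16, 16, 3
# A batch of 2-channel images and one (top, left) coordinate per sample.
color = rng.random((batch_size, 2, H, W)).astype(np.float32)
top = rng.integers(0, H - P + 1, size=batch_size)
left = rng.integers(0, W - P + 1, size=batch_size)

# Explicit-batch equivalent of slicing with per-sample coordinates.
patches = [
    color[i][:, top[i]:top[i] + P, left[i]:left[i] + P]
    for i in range(batch_size)
]
# Per-sample, per-channel mean with shape (2, 1, 1), like the nested reductions.
means = [p.mean(axis=(1, 2), keepdims=True) for p in patches]
```

Each sample gets its own square and its own mean, so one pass through the loop body yields one square per image, not one square shared by the batch.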

mzient avatar May 16 '24 13:05 mzient

How long could it get?

Oops, I somehow thought your idea was to check every possible value of g.sample() instead of just checking whether i < g.sample() (guess I was too sleepy that night). And yes, the loop is not that long, so this would work just fine, thank you.

No; the batch in DALI is implicit. When you do: slice = img[t:b, l:r], you're in fact slicing all images - and if l, t, r, b are DataNodes, not constants, they too are batches. With explicit batch this would be:

Ah, that cleared up some of my misunderstandings. However, I still do not understand how you intend to generate a mask for each image. Now I know that fn.reductions.mean(color[:,h:h+P,w:w+P]) calculates the mean values of that square for every image in the batch. Wouldn't the loop generate coordinates/means for a single mask?

jackdaw213 avatar May 17 '24 13:05 jackdaw213

Hello @mzient, I am trying to make the loop work, but there is a bug that I can't seem to fix: TypeError: float() argument must be a string or a real number, not 'DataNodeDebug'. The variables x, y, and P are all DataNodes, so fn.erase should just work like the example you gave me, right?

g = Geometric(1/8)
sample = int(g.sample().item())
P = fn.random.uniform(range=[1, 9], shape=(), dtype=types.UINT8)
masks = []
mean = None

for i in range(100):
    if i > sample:
        break

    x = fn.cast(fn.random.normal(mean=(H-P+1)/2., 
                                            stddev=(H-P+1)/4.), 
                                            dtype=types.UINT8)
    y = fn.cast(fn.random.normal(mean=(W-P+1)/2., 
                                            stddev=(W-P+1)/4.), 
                                            dtype=types.UINT8)

    mean = fn.reductions.mean(fn.reductions.mean(color[:, x:x+P, y:y+P], axes=2, keep_dims=True), axes=1, keep_dims=True)

    mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
    mask = fn.erase(mask, fill_value=mean, anchor=(x, y), shape=(P, P), axes=(1, 2))  # <--- Error here
    masks.append(mask)
    
black = fn.cat(black, fn.stack(*masks), axis=0)

jackdaw213 avatar May 21 '24 13:05 jackdaw213

Hi @jackdaw213,

anchor and shape should each be a single data node/tensor with the right dimensionality, not a tuple of data nodes. Please stack/cat them together into one value and try again.
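
To make the expected argument layout concrete, here is a NumPy emulation of what the erase call is meant to do, with anchor and shape each passed as one 1-D array. The function erase_region is a hypothetical stand-in written for illustration; it is not DALI's implementation of fn.erase.

```python
import numpy as np

def erase_region(mask, anchor, shape, fill_value):
    """Return a copy of `mask` (C, H, W) with one rectangle set to `fill_value`.

    `anchor` and `shape` are single 1-D arrays: [row, col] and [height, width].
    """
    out = mask.copy()
    r, c = (int(v) for v in anchor)
    h, w = (int(v) for v in shape)
    out[:, r:r + h, c:c + w] = fill_value  # a (C, 1, 1) fill broadcasts over the region
    return out

x, y, P = 3, 5, 2
anchor = np.stack([x, y])  # one array instead of a tuple of scalars
shape = np.stack([P, P])
fill = np.array([0.25, -0.5], dtype=np.float32).reshape(2, 1, 1)
mask = erase_region(np.zeros((2, 8, 8), dtype=np.float32), anchor, shape, fill)
```

The fill value here has shape (2, 1, 1), matching the per-channel mean produced by the reductions with keep_dims=True.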

JanuszL avatar May 21 '24 15:05 JanuszL

Hello @JanuszL, I modified my code a bit and added fn.stack, but there are still some issues:

g = Geometric(1/8)
sample = int(g.sample().item())
P = fn.random.uniform(range=[1, 9], shape=(), dtype=types.UINT8)
mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
mean = None

for i in range(100):
    if i > sample:
        break

    x = fn.cast(fn.random.normal(mean=(H-P+1)/2., 
                                            stddev=(H-P+1)/4.), 
                                            dtype=types.UINT8)
    y = fn.cast(fn.random.normal(mean=(W-P+1)/2., 
                                            stddev=(W-P+1)/4.), 
                                            dtype=types.UINT8)

    x = math.clamp(x, 0, H-P)  # <-- Error if I try to stack both x, y after clamping them
    y = math.clamp(y, 0, W-P)

    mean = fn.reductions.mean(fn.reductions.mean(color[:, x:x+P, y:y+P], axes=2, keep_dims=True), axes=1, keep_dims=True)

    mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
    mask = fn.erase(mask, fill_value=mean, anchor=fn.stack(x, y), shape=fn.stack(P, P), axes=(1, 2))  # <-- First error here
    
black = fn.cat(black, mask, axis=0)

  1. When I run the code above, it gives me this error: TypeError: RunOperatorGPU(): incompatible function arguments. The following argument types are supported: 1. (self: nvidia.dali.backend_impl.PipelineDebug, arg0: int, arg1: List[nvidia.dali.tensors.TensorListGPU], arg2: Dict[str, nvidia.dali.tensors.TensorListCPU], arg3: int) -> List[nvidia.dali.tensors.TensorListGPU]. From what I understand, it seems like fill_value needs to be an integer. Casting mean to uint8 doesn't work, but setting fill_value to 1 does. Is there any way to use mean as the fill_value? mean has the shape (2,1,1).
  2. I also tried to clip/clamp x and y to the range [0, H/W - P], but then fn.stack(x, y) throws this error: TypeError: object of type 'DataNode' has no len()
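
For reference, the intended coordinate sampling (a normal distribution centred on the valid range, clipped so the square stays inside the image) can be sketched in NumPy as below. The helper sample_offset and the fixed P are assumptions made for illustration. Note that the very first snippet in this thread clips h against W-P rather than H-P, which is harmless only because H == W.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, H, W, P = 8, 256, 256, 5

def sample_offset(extent, patch, size):
    """Normal around the centre of the valid range, clipped to [0, extent - patch]."""
    centre = (extent - patch + 1) / 2.0
    spread = (extent - patch + 1) / 4.0
    raw = rng.normal(loc=centre, scale=spread, size=size)
    # Clip while the values are still floating point, then convert to integer indices.
    return np.clip(raw, 0, extent - patch).astype(np.int64)

h = sample_offset(H, P, batch_size)  # one row offset per sample
w = sample_offset(W, P, batch_size)  # one column offset per sample
```

Clipping before the integer conversion keeps negative draws from ever reaching an unsigned type.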

jackdaw213 avatar May 22 '24 12:05 jackdaw213

@jackdaw213 I'm quite sure there are some constructs that work in "regular" mode but not in "debug". Can you try to run your code without debugging?

mzient avatar May 22 '24 12:05 mzient

@mzient I turned off debug mode and those 2 issues seem to be gone; however, a new issue popped up. color is a GPU data node, so mean is also a GPU data node. But fn.erase needs mean to be a CPU data node: "Error while specifying argument 'fill_value'. Named argument inputs to operators must be CPU data nodes. However, a GPU data node was provided." I tried adding device="cpu" to fn.reductions.mean, but it does not work because "An operator with device='cpu' cannot accept GPU inputs." Also, from #1176 it seems that DALI does not support GPU to CPU transfer? Not related to the current issue, but why is this the case?

jackdaw213 avatar May 22 '24 23:05 jackdaw213

Any ideas @mzient ? I did some more research, but I couldn't find any method to transfer data from the GPU to the CPU.

jackdaw213 avatar May 28 '24 01:05 jackdaw213

@jackdaw213 Thank you for checking non-debug pipeline. Currently there's no way to go from GPU to CPU within a single pipeline. We're actively working on relaxing the execution model to allow arbitrary transfers, however, a usable version is still a release or two away.

mzient avatar May 28 '24 08:05 mzient

@jackdaw213 Hello again, GPU->CPU copy is now possible as an opt-in feature (it requires using a special executor, enabled with exec_dynamic=True in your pipeline_def). With that you can call .cpu() on a DataNode to perform a GPU->CPU transfer. Also, access to shapes is simplified with DataNode.shape(), whose result is returned on the CPU.

mzient avatar Nov 27 '24 08:11 mzient

Hello @mzient, thank you so much for your work. However, the problem persists:

@pipeline_def(device_id=0, exec_dynamic=True)
  def dali_pipeline(image_dir, color_peak=False):
      images, _ = fn.readers.file(file_root=image_dir, 
                                  files=utils.list_images(image_dir),
                                  random_shuffle=True, 
                                  name="Reader")
      H, W = 256, 256
      
      images = fn.decoders.image(images, device="mixed", output_type=types.RGB)
      images = images / 255

      images = fn.resize(images, size=512)

      images = fn.crop_mirror_normalize(images, 
                                      dtype=types.FLOAT,
                                      output_layout="HWC",
                                      crop=(H, W),
                                      crop_pos_x=fn.random.uniform(range=(0, 1)),
                                      crop_pos_y=fn.random.uniform(range=(0, 1)))

      images = fn.python_function(images, function=rgb2lab)

      images = fn.transpose(images, perm=[2, 0, 1])

      color = images[1:, :, :] / 110
      black = (fn.expand_dims(images[0, :, :], axes=0) - 50) / 100
      
      if color_peak:
          g = Geometric(1/8)
          sample = int(g.sample().item())
          mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
          
          for i in range(100):
              if i > sample:
                  break

              P = fn.random.uniform(range=[1, 9], shape=(), dtype=types.UINT8)

              x = fn.cast(fn.random.normal(mean=(H-P+1)/2., 
                                          stddev=(H-P+1)/4.), 
                                          dtype=types.UINT8)
              
              y = fn.cast(fn.random.normal(mean=(W-P+1)/2., 
                                          stddev=(W-P+1)/4.), 
                                          dtype=types.UINT8)
              
              mean = fn.reductions.mean(fn.reductions.mean(color[:, x:x+P, y:y+P], axes=2, keep_dims=True), axes=1, keep_dims=True)
              mask = fn.erase(mask, 
                          fill_value=mean.cpu(),  # <-- Transferred to CPU, but the error still occurs
                          anchor=fn.stack(x, y), 
                          shape=fn.stack(P, P), 
                          axes=(1, 2))
              
          black = fn.cat(black, mask, axis=0)

      return black, color

The error message:

Traceback (most recent call last):
  File "/home/jackdaw/GitHub/ImageColorization/main.py", line 85, in <module>
    train_loader = DALIGenericIterator(
                   ^^^^^^^^^^^^^^^^^^^^
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/plugin/pytorch/__init__.py", line 208, in __init__
    _DaliBaseIterator.__init__(
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/plugin/base_iterator.py", line 215, in __init__
    p.build()
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/pipeline.py", line 1057, in build
    self._init_pipeline_backend()
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/pipeline.py", line 909, in _init_pipeline_backend
    related_logical_id[op.relation_id] = self._pipe.AddOperator(op.spec, op.name)
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Error in GPU operator nvidia.dali.fn.erase, which was used in the pipeline definition with the following traceback:

File "/home/jackdaw/GitHub/ImageColorization/dataset.py", line 121, in dali_pipeline mask = fn.erase(mask,

encountered:

Error while specifying argument 'fill_value'. Named argument inputs to operators must be CPU data nodes. However, a GPU data node was provided. C++ context: [/opt/dali/dali/pipeline/pipeline.cc:358]

jackdaw213 avatar Nov 30 '24 00:11 jackdaw213

@jackdaw213 Thank you for reporting the issue. This is indeed a bug that'll be fixed in the December release of DALI. Meanwhile, as a workaround just pass the result of .cpu() through some cheap CPU operator where your mean would be a positional input. fill_value=mean.cpu() + 0 should work in your case. For larger tensors, I'd recommend some fn.reshape or fn.reinterpret, since these operators just produce views.

Edit: The fix has been merged (#5732) and will be part of the next release.

mzient avatar Dec 02 '24 09:12 mzient

@mzient Thank you for your hard work

jackdaw213 avatar Dec 10 '24 05:12 jackdaw213