DALI
TypeError: 'DataNode' object does not support item assignment
Describe the question.
@pipeline_def(device_id=0)
def dali_pipeline(image_dir):
    images, _ = fn.readers.file(file_root=image_dir,
                                files=utils.list_images(image_dir),
                                random_shuffle=True,
                                name="Reader")
    H, W = 256, 256
    images = fn.decoders.image(images, device="mixed", output_type=types.RGB)
    images = images / 255
    images = fn.resize(images, size=512)
    images = fn.crop_mirror_normalize(images,
                                      dtype=types.FLOAT,
                                      output_layout="HWC",
                                      crop=(H, W),
                                      crop_pos_x=fn.random.uniform(range=(0, 1)),
                                      crop_pos_y=fn.random.uniform(range=(0, 1)))
    images = fn.python_function(images, function=rgb2lab)
    images = fn.transpose(images, perm=[2, 0, 1])
    color = images[1:, :, :] / 110
    black = (fn.expand_dims(images[0, :, :], axes=0) - 50) / 100
    # How to loop through each image?
    for per image:
        g = Geometric(1/8)
        P = np.random.choice([1, 2, 3, 4, 5, 6, 7, 8, 9])
        mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
        for i in range(int(g.sample().item())):
            h = int(torch.clip(torch.normal(mean=torch.tensor((H-P+1)/2.), std=torch.tensor((H-P+1)/4.)), 0, W-P))
            w = int(torch.clip(torch.normal(mean=torch.tensor((W-P+1)/2.), std=torch.tensor((W-P+1)/4.)), 0, W-P))
            # Error here
            mask[:,h:h+P,w:w+P] = fn.reductions.mean(fn.reductions.mean(color[:,h:h+P,w:w+P],axes=2,keep_dims=True),axes=1,keep_dims=True)
        black = fn.cat(black, mask, axis=0)
    return black, color
My dali pipeline is the above and I have 2 questions:
- How do I loop and calculate a mask for each image? Is there any efficient way of doing it outside of a loop?
- mask is a DataNode and does not support item assignment. How can I resolve this issue? I tried to create a mask as a PyTorch tensor and used torch.mean(), but color is a DataNode, which does not work with PyTorch.
Check for duplicates
- [X] I have searched the open bugs/issues and have found no duplicates for this bug report
Hi @jackdaw213,
Thank you for reaching out. A couple of observations from our side:
- you shouldn't loop over the images in a loop - in DALI batch is implicit and each operation is applied to all samples in it. If you want different processing for some samples please check the conditional execution
- DALI doesn't support loops inside the pipeline that are evaluated in the runtime. So for i in range(3) will work, but for i in range(fn.random....) won't
- DataNode and does not support item assignment - you can create multiple pieces of the mask and then cat them together like here
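To make the last point concrete, here is a minimal NumPy sketch (not DALI code) of assembling a mask from pieces with concatenation instead of item assignment. The helper name patch_mask_by_concat and the toy sizes are made up for illustration; in a DALI pipeline the analogous building blocks would be fn.cat and constant tensors, and that mapping is an assumption rather than something tested here.

```python
import numpy as np

def patch_mask_by_concat(fill, H, W, h, w, P):
    """Build a (C, H, W) array that is zero everywhere except a P x P
    square at row h, column w, which holds the per-channel values in `fill`.
    The result is assembled from pieces; nothing is assigned in place."""
    C = fill.shape[0]
    dtype = fill.dtype
    # The filled square: per-channel value broadcast to (C, P, P).
    patch = np.broadcast_to(fill.reshape(C, 1, 1), (C, P, P))
    # Zero strips to the left and right of the square, same height as it.
    left = np.zeros((C, P, w), dtype=dtype)
    right = np.zeros((C, P, W - w - P), dtype=dtype)
    middle = np.concatenate([left, patch, right], axis=2)   # (C, P, W)
    # Zero bands above and below the square.
    top = np.zeros((C, h, W), dtype=dtype)
    bottom = np.zeros((C, H - h - P, W), dtype=dtype)
    return np.concatenate([top, middle, bottom], axis=1)    # (C, H, W)

rng = np.random.default_rng(0)
color = rng.random((2, 16, 16), dtype=np.float32)
h, w, P = 3, 5, 4
fill = color[:, h:h + P, w:w + P].mean(axis=(1, 2))  # per-channel mean, shape (2,)
mask = patch_mask_by_concat(fill, 16, 16, h, w, P)
```

This covers a single square. Several squares would need an extra step to merge the individual masks, which is one reason a fill-style operator may be more convenient than pure concatenation.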
Hello @JanuszL, thank you for the answers
you shouldn't loop over the images in a loop - in DALI batch is implicit and each operation is applied to all samples in it. If you want different processing for some samples please check the conditional execution
The operation creates g.sample() squares of size P, calculates the average color inside each square, and cats the result with black; this is done for each image. I think I can figure out a workaround for g.sample(), but I do not know how to approach the per-image average-color mask step. Any tips?
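For reference, the per-image operation described above can be written down outside DALI as a plain NumPy sketch. This is only a restatement of the intent, not a pipeline solution. The function name color_hint_mask and the toy sizes are made up. Note one assumption: NumPy's geometric sampler counts trials starting at 1, while torch.distributions.Geometric counts failures starting at 0, so the sketch subtracts 1 to match the torch behaviour.

```python
import numpy as np

def color_hint_mask(color, rng, p_geom=1 / 8):
    """color: (2, H, W) array holding the AB channels of one image.
    Returns a (2, H, W) mask that is zero except for a random number of
    P x P squares, each filled with the mean color of that square."""
    _, H, W = color.shape
    mask = np.zeros_like(color)
    P = int(rng.integers(1, 10))          # square size in 1..9, once per image
    n = int(rng.geometric(p_geom)) - 1    # shift support to {0, 1, 2, ...}
    for _ in range(n):
        # Clipped normal around the image centre, as in the question.
        h = int(np.clip(rng.normal((H - P + 1) / 2, (H - P + 1) / 4), 0, H - P))
        w = int(np.clip(rng.normal((W - P + 1) / 2, (W - P + 1) / 4), 0, W - P))
        patch = color[:, h:h + P, w:w + P]
        mask[:, h:h + P, w:w + P] = patch.mean(axis=(1, 2), keepdims=True)
    return mask

rng = np.random.default_rng(0)
color = rng.random((2, 32, 32))
black = rng.random((1, 32, 32))
mask = color_hint_mask(color, rng)
hint = np.concatenate([black, mask], axis=0)  # L channel plus 2-channel mask
```

The open question in this thread is how to express the same thing with DALI operators, where item assignment and data-dependent loop lengths are not available.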
DALI doesn't support loops inside the pipeline that are evaluated in the runtime. So for i in range(3) will work, but for i in range(fn.random....) won't
Ah that's unfortunate, thank you for the info
DataNode and does not support item assignment - you can create multiple pieces of the mask and then cat them together like here
I don't know if cat can replace the item assignments in this situation, or maybe I do not understand the example correctly. color holds the 2 AB channels of the LAB image, and I want to sample multiple squares of size P and average the color inside each of them. Then I want to cat the mask containing those squares with L/black, but from the examples you gave me I don't really see how to cat multiple squares individually into the mask.
Hello @jackdaw213,
If your loop isn't too long, then it should be possible to unroll it to the maximum length. With that you could use conditional execution to emulate shorter loops. Inside the loop you could generate the mean values and your square coordinates which you could then stack and pass to fn.erase.
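The unrolling idea can be illustrated with a small NumPy sketch (not DALI code): the loop always runs for a fixed maximum number of steps, and a per-sample condition decides whether a step has any effect. The function name unrolled_accumulate and the toy data are hypothetical, and np.where stands in for conditional execution only as an analogy.

```python
import numpy as np

def unrolled_accumulate(values, counts, max_iters):
    """values: (batch, max_iters) array, counts: (batch,) integer array.
    Emulates, for every sample b,
        for i in range(counts[b]): total[b] += values[b, i]
    with a loop whose trip count is fixed ahead of time."""
    total = np.zeros(values.shape[0], dtype=values.dtype)
    for i in range(max_iters):          # fixed length, known when the graph is built
        active = i < counts             # per-sample flag, shape (batch,)
        # Samples whose own loop has already ended keep their old value.
        total = np.where(active, total + values[:, i], total)
    return total

values = np.arange(12, dtype=np.float64).reshape(3, 4)
counts = np.array([0, 2, 4])
result = unrolled_accumulate(values, counts, max_iters=4)
# Sample 0 takes no steps, sample 1 takes two (4 + 5), sample 2 takes all four.
```

The cost is that every sample pays for max_iters steps, so this only makes sense when the maximum loop length is small.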
Hello @mzient, thank you for your response
If your loop isn't too long, then it should be possible to unroll it to the maximum length. With that you could use conditional execution to emulate shorter loops
That's a good idea but sometimes the loop would get quite lengthy so that's not suitable for this situation
Inside the loop you could generate the mean values and your square coordinates which you could then stack and pass to fn.erase.
This will create the same mask with the same square size/location for all images in the batch right ? But I want a different mask for each image in the batch, is it possible ?
If your loop isn't too long, then it should be possible to unroll it to the maximum length. With that you could use conditional execution to emulate shorter loops
That's a good idea but sometimes the loop would get quite lengthy so that's not suitable for this situation
How long could it get?
Inside the loop you could generate the mean values and your square coordinates which you could then stack and pass to fn.erase.
This will create the same mask with the same square size/location for all images in the batch right ? But I want a different mask for each image in the batch, is it possible ?
No; the batch in DALI is implicit. When you do: slice = img[t:b, l:r], you're in fact slicing all images - and if l, t, r, b are DataNodes, not constants, they too are batches. With explicit batch this would be:
slice = [
    img[i][t[i]:b[i], l[i]:r[i]] for i in range(batch_size)
]
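The same explicit-batch view, extended to the patch mean from this thread, can be sketched in NumPy (not DALI code). Each sample gets its own coordinates and its own mean. The helper name per_sample_patch_means and the toy values are made up for illustration.

```python
import numpy as np

def per_sample_patch_means(batch, tops, lefts, sizes):
    """batch: list of (C, H, W) arrays; tops, lefts, sizes: one int per sample.
    Returns one (C, 1, 1) patch mean per sample, each computed from that
    sample's own square."""
    return [
        img[:, t:t + p, l:l + p].mean(axis=(1, 2), keepdims=True)
        for img, t, l, p in zip(batch, tops, lefts, sizes)
    ]

rng = np.random.default_rng(0)
batch = [rng.random((2, 8, 8)) for _ in range(3)]
tops, lefts, sizes = [0, 2, 4], [1, 3, 5], [2, 3, 1]
means = per_sample_patch_means(batch, tops, lefts, sizes)
```

Reducing over both spatial axes at once gives the same result as the two nested mean reductions used in the pipeline code, since every row of the square has the same length.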
How long could it get?
Oops, I somehow thought that your idea was to check for every possible value of g.sample() instead of just checking if i < g.sample() (guess I was too sleepy that night). And yes, the loop is not that long, so this would work just fine, thank you.
No; the batch in DALI is implicit. When you do: slice = img[t:b, l:r], you're in fact slicing all images - and if l, t, r, b are DataNodes, not constants, they too are batches. With explicit batch this would be:
Ah, that cleared up some of my misunderstandings. However, I still do not understand your intention of generating a mask for each image. Now I know that fn.reductions.mean(color[:,h:h+P,w:w+P]) calculates the mean values of that square for every image in the batch. Wouldn't the loop generate coordinates/means for a single mask ?
Hello @mzient, I am trying to make the loop work, but there is a bug that I can't seem to fix: TypeError: float() argument must be a string or a real number, not 'DataNodeDebug'. The variables x, y, and P are all DataNodes, so fn.erase should just work like the example you gave me, right?
g = Geometric(1/8)
sample = int(g.sample().item())
P = fn.random.uniform(range=[1, 9], shape=(), dtype=types.UINT8)
masks = []
mean = None
for i in range(100):
    if i > sample:
        break
    x = fn.cast(fn.random.normal(mean=(H-P+1)/2.,
                                 stddev=(H-P+1)/4.),
                dtype=types.UINT8)
    y = fn.cast(fn.random.normal(mean=(W-P+1)/2.,
                                 stddev=(W-P+1)/4.),
                dtype=types.UINT8)
    mean = fn.reductions.mean(fn.reductions.mean(color[:, x:x+P, y:y+P], axes=2, keep_dims=True), axes=1, keep_dims=True)
    mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
    mask = fn.erase(mask, fill_value=mean, anchor=(x, y), shape=(P, P), axes=(1, 2))  # <--- Error here
    masks.append(mask)
black = fn.cat(black, fn.stack(*masks), axis=0)
Hi @jackdaw213,
anchor and shape should each be a single data node/tensor with the right dimensionality, not a tuple of data nodes. Please stack/cat the components together into one value and try again.
Hello @JanuszL, I modified my code a bit and added fn.stack, but there are some issues:
g = Geometric(1/8)
sample = int(g.sample().item())
P = fn.random.uniform(range=[1, 9], shape=(), dtype=types.UINT8)
mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
mean = None
for i in range(100):
    if i > sample:
        break
    x = fn.cast(fn.random.normal(mean=(H-P+1)/2.,
                                 stddev=(H-P+1)/4.),
                dtype=types.UINT8)
    y = fn.cast(fn.random.normal(mean=(W-P+1)/2.,
                                 stddev=(W-P+1)/4.),
                dtype=types.UINT8)
    x = math.clamp(x, 0, H-P)  # <-- Error if I try to stack both x, y after clamping them
    y = math.clamp(y, 0, W-P)
    mean = fn.reductions.mean(fn.reductions.mean(color[:, x:x+P, y:y+P], axes=2, keep_dims=True), axes=1, keep_dims=True)
    mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
    mask = fn.erase(mask, fill_value=mean, anchor=fn.stack(x, y), shape=fn.stack(P, P), axes=(1, 2))  # <-- First error here
black = fn.cat(black, mask, axis=0)
- When I run the code above it gives me this error: TypeError: RunOperatorGPU(): incompatible function arguments. The following argument types are supported: 1. (self: nvidia.dali.backend_impl.PipelineDebug, arg0: int, arg1: List[nvidia.dali.tensors.TensorListGPU], arg2: Dict[str, nvidia.dali.tensors.TensorListCPU], arg3: int) -> List[nvidia.dali.tensors.TensorListGPU]. From what I understand, it seems like fill_value needs to be an integer. Casting mean to uint8 doesn't work, but setting fill_value to 1 does. Is there any way to make mean the fill_value? mean has the shape of (2,1,1)
- I also tried to clip/clamp x and y to the range [0, H/W - P] but then fn.stack(x, y) will throw this error: TypeError: object of type 'DataNode' has no len()
@jackdaw213 I'm quite sure there are some constructs that work in "regular" mode but not in "debug". Can you try to run your code without debugging?
@mzient I turned off debug mode and those 2 issues seem to be gone; however, a new issue popped up. color is a GPU data node, so mean is also a GPU data node. But fn.erase needs mean to be a CPU data node: Error while specifying argument 'fill_value'. Named argument inputs to operators must be CPU data nodes. However, a GPU data node was provided. I tried adding device="cpu" to fn.reductions.mean, but it does not work because: An operator with device='cpu' cannot accept GPU inputs. Also, from #1176 it seems that DALI does not support GPU to CPU transfer? Not related to the current issue, but why is this the case?
Any ideas @mzient ? I did some more research, but I couldn't find any method to transfer data from the GPU to the CPU.
@jackdaw213 Thank you for checking non-debug pipeline. Currently there's no way to go from GPU to CPU within a single pipeline. We're actively working on relaxing the execution model to allow arbitrary transfers, however, a usable version is still a release or two away.
@jackdaw213 Hello again,
GPU->CPU copy is now possible as an opt-in feature (it requires using a special executor, enabled with exec_dynamic=True in your pipeline_def).
With that you can call .cpu() on a DataNode to perform GPU->CPU transfer.
Also, access to shapes is simplified with DataNode.shape(), whose result is returned on the CPU.
Hello @mzient, thank you so much for your work. However, the problem persisted
@pipeline_def(device_id=0, exec_dynamic=True)
def dali_pipeline(image_dir, color_peak=False):
    images, _ = fn.readers.file(file_root=image_dir,
                                files=utils.list_images(image_dir),
                                random_shuffle=True,
                                name="Reader")
    H, W = 256, 256
    images = fn.decoders.image(images, device="mixed", output_type=types.RGB)
    images = images / 255
    images = fn.resize(images, size=512)
    images = fn.crop_mirror_normalize(images,
                                      dtype=types.FLOAT,
                                      output_layout="HWC",
                                      crop=(H, W),
                                      crop_pos_x=fn.random.uniform(range=(0, 1)),
                                      crop_pos_y=fn.random.uniform(range=(0, 1)))
    images = fn.python_function(images, function=rgb2lab)
    images = fn.transpose(images, perm=[2, 0, 1])
    color = images[1:, :, :] / 110
    black = (fn.expand_dims(images[0, :, :], axes=0) - 50) / 100
    if color_peak:
        g = Geometric(1/8)
        sample = int(g.sample().item())
        mask = types.Constant(shape=(2, H, W), value=0, dtype=types.FLOAT, device="gpu")
        for i in range(100):
            if i > sample:
                break
            P = fn.random.uniform(range=[1, 9], shape=(), dtype=types.UINT8)
            x = fn.cast(fn.random.normal(mean=(H-P+1)/2.,
                                         stddev=(H-P+1)/4.),
                        dtype=types.UINT8)
            y = fn.cast(fn.random.normal(mean=(W-P+1)/2.,
                                         stddev=(W-P+1)/4.),
                        dtype=types.UINT8)
            mean = fn.reductions.mean(fn.reductions.mean(color[:, x:x+P, y:y+P], axes=2, keep_dims=True), axes=1, keep_dims=True)
            mask = fn.erase(mask,
                            fill_value=mean.cpu(),  # <-- To CPU transfer but still encountered error
                            anchor=fn.stack(x, y),
                            shape=fn.stack(P, P),
                            axes=(1, 2))
        black = fn.cat(black, mask, axis=0)
    return black, color
The error message:
Traceback (most recent call last):
  File "/home/jackdaw/GitHub/ImageColorization/main.py", line 85, in <module>
    train_loader = DALIGenericIterator(
                   ^^^^^^^^^^^^^^^^^^^^
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/plugin/pytorch/__init__.py", line 208, in __init__
    _DaliBaseIterator.__init__(
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/plugin/base_iterator.py", line 215, in __init__
    p.build()
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/pipeline.py", line 1057, in build
    self._init_pipeline_backend()
  File "/home/jackdaw/.conda/envs/ai/lib/python3.12/site-packages/nvidia/dali/pipeline.py", line 909, in _init_pipeline_backend
    related_logical_id[op.relation_id] = self._pipe.AddOperator(op.spec, op.name)
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Error in GPU operator nvidia.dali.fn.erase, which was used in the pipeline definition with the following traceback:
  File "/home/jackdaw/GitHub/ImageColorization/dataset.py", line 121, in dali_pipeline
    mask = fn.erase(mask,
encountered:
Error while specifying argument 'fill_value'. Named argument inputs to operators must be CPU data nodes. However, a GPU data node was provided.
C++ context: [/opt/dali/dali/pipeline/pipeline.cc:358]
@jackdaw213 Thank you for reporting the issue.
This is indeed a bug that'll be fixed in the December release of DALI.
Meanwhile, as a workaround just pass the result of .cpu() through some cheap CPU operator where your mean would be a positional input.
fill_value=mean.cpu() + 0 should work in your case.
For larger tensors, I'd recommend some fn.reshape or fn.reinterpret, since these operators just produce views.
Edit: The fix has been merged (#5732) and will be part of the next release.
@mzient Thank you for your hard work