Halide icon indicating copy to clipboard operation
Halide copied to clipboard

Odd variable/buffer names in conceptual_stmt.

Open mcourteaux opened this issue 1 year ago • 5 comments

I get a lot of .0, .1, .2, .4, ... as suffixes in buffer names in the conceptual stmt, which would be nice if it can be cleaned up. Haven't taken the time to figure out where they come from.

A few examples:

image

When using rfactor() for a summation over a 2D RDom:

image

A bit of everything:

image

image

Where in the above screenshot, the red arrows are all the places where the buffers have the suffix, yet the producer/consumer name is clean, as indicated by the green arrows.

mcourteaux avatar Oct 28 '23 10:10 mcourteaux

Re: .0, .1 notations. Do you mean the Tuple datatype? I also saw these suffixes in the IR of complex number datatype implemented in apps/fft/complex.h.

See also the pinned issue "Halide development roadmap": #5055 . Quoting the key summary:

Generators with multiple outputs (there's a trade-off between tuples, extra channels, compute_with)

antonysigma avatar Oct 28 '23 20:10 antonysigma

I think it's injected here: https://github.com/halide/Halide/blob/main/src/FuseGPUThreadLoops.cpp#L480

and I think the reason is that when allocations are hoisted to the gpu block level, they might have name collisions with other allocations that now have the same scope, and this numbering makes that impossible.

abadams avatar Oct 29 '23 20:10 abadams

Aha, that sounds reasonable. Any idea why the rfactor() intermediate Func allocs a float[1] array for every float, instead of a single float array for all of them.

~~Could it be that the Func is computed at a deeper nesting level, but the GPU lowering pass moves up allocations to the innermost GPU-thread loop, and therefore the Allocate node gets duplicated outside of the loop? I haven't really investigated, and I'm also not sure if my vague theory could hold for loops with non-statically-known bounds.~~ EDIT: I doubt it. The schedule looks like this is not the case.

mcourteaux avatar Oct 30 '23 09:10 mcourteaux

I think the reason is that when allocations are hoisted to the gpu block level, they might have name collisions with other allocations that now have the same scope

So it should be allowed to only add those numbers in case of collisions? Maybe one day I'll feel like trying to implement that cheap change with a std::map<std::string, int> and make a PR.

mcourteaux avatar Oct 30 '23 09:10 mcourteaux

I can't assign people (or add labels, or ask for reviews or anything), but I'd assign myself if I could.

mcourteaux avatar Oct 30 '23 10:10 mcourteaux