need docs for .assemble() scheduling directive

Open Infinoid opened this issue 3 years ago • 0 comments

The .assemble() scheduling directive was added fairly recently. It seems to control the mechanism by which values are written to sparse output tensors.

This directive isn't mentioned on the scheduling page of the website, and it isn't mentioned in the taco -help=scheduling text either. If someone could write something for the web docs, I'd be happy to update the command line tool accordingly.

I also see that the web tool supports this scheduling directive, though it omits the separately_schedulable flag.

I played with this directive a bit, and it seems to work. In some cases I hit a "Precondition failed: Ungrouped insertion not support for output tensors that are scattered into" error. I don't fully understand what triggers it, but adding a reorder made it go away.

It would be good if we could describe:

  • what does it do?
  • how does it affect performance?
  • how does it affect parallelism?
  • what are the restrictions on its use?
  • is this more useful for some sparse formats than others, or some styles of parallelism than others?
  • can we provide a practical example or two?
  • what does the separately_schedulable flag mean?

Infinoid, Jun 04 '21 15:06